ABSTRACT

The objective of this study is to identify some of 
the structural features of an elementary logic curriculum which 
affect logic problem difficulty. The system under review is a 
computer-based logic instructional system (LIS) at Stanford 
University. Four modes of problem presentation (multiple-choice,
truth-analysis, counterexample, and derive) are described. Various
empirical measures of problem difficulty and measures of problem 
structure (including structural variables, standard proof variables, 
and sequential variables) are considered. The performance of college 
students using the system is analyzed, and variables which contribute 
to the difficulty of a problem are identified. (JK) 
CHAPTER I



INTRODUCTION 

The investigation described in this dissertation is partially
motivated by a desire to focus attention on certain deficiencies in
computer-assisted instruction (CAI) research. The current emphasis 
in CAI research is on exploring and discovering new ways in which 
humans and computers can interact. This involves the design of special 
hardware and the implementation of new programming techniques (software). 
The reader is referred to Wexler (1970) for a brief account of the
historical development of CAI. In his account, one can clearly see that 
the primary emphasis in CAI research projects has been system development. 
Most of the projects have been implemented under ideal operating condi- 
tions for small and highly motivated groups of students, while little or 
no attention has been given to evaluating the curriculum or the pedagogi- 
cal methods used. 

Usually, the system is designed to simulate some intuitive concept 
of a "good teacher" and to "individualize instruction." The result has 
been a large collection of complex, interesting, and, from a computer 
scientist's point of view, valuable instructional systems. However, 
little machinery is available to judge their educational value or 
relevance in any systematic or quantitative way. 

For several years, the CAI Laboratory at the Institute for 
Mathematical Studies in the Social Sciences (IMSSS) has been offering 
a course in mathematical logic. The availability of this course has 
made it possible to collect information on student behavior in elementary
mathematical logic, information which was unavailable before the advent of 
CAI. This dissertation is a first attempt at an in-depth analysis of some 
of the factors which contribute to problem difficulty in elementary 
mathematical logic. The major focus of the study is to develop an under- 
standing which might eventually lead to a quantitative theory of problem 
solving in logic. This work is in the spirit of the analyses in elementary 
mathematics to be found in Suppes, Jerman and Brian (1968) and in Loftus
(1970). My approach involves formally describing the relationship between
structural features of logic problems and problem difficulty, as well as 
the development of models which predict difficulty as a function of 
curriculum structure. 

Unfortunately, the researcher interested in utilizing an operational 
CAI system faces many novel problems. In the remainder of this chapter 
I shall mention some of these problems because I consider them important 
and relevant to discussions of CAI. However, it is not the purpose of 
this dissertation to provide detailed discussion or present serious
evidence on these problems.

Not only does the computer provide us with the ability to create a
large number of new educational environments, but it also provides us with 
a capability for recording and preserving many aspects of student behavior. 
However, the utilization of this data-collection capability presents 
several problems. In a large-scale CAI system, such as the one at IMSSS 
where many CAI programs are being run concurrently, it is possible to 
become inundated with student-response data. As the volume of data 
collected increases, system reliability goes down and computer-response 
time goes up. Also, the time and overhead required to remove the data 
from high-cost, short-term devices to low-cost, long-term devices
increases. It is a serious mistake to overload the system by
indiscriminately recording every student response. To minimize the amount of data
collected, one must plan carefully what is required for a particular study. 

This planning must involve another feature of computer systems, 
namely, the finite probability of system failure and its effect on the
data. Systems can and do fail, and data are unavoidably lost because it 
is not economically feasible to have duplicate backup facilities for 
educational systems. As a result, it is not always possible to implement 
carefully controlled experimental designs or paradigms on a large-scale 
operational CAI system. 

The problems facing the data collector are categorized under three 
major headings. First are problems which arise as a result of hardware
and/or software failure. The failure of any component may result in a 
serious curtailment or cessation of operations. Failures usually have 
an adverse effect on data collection, the chief effect being an 
unrecoverable loss of a part of the data. Precautions can be taken to 
minimize the loss of data, but the loss cannot be predicted or entirely
prevented.

In a CAI classroom, the second major area of concern is the 
student-proctor interaction. A proctor is the person who supervises
and aids students while they are at the computer terminals. In the CAI 
system at Stanford University, personnel who serve as proctors vary 
widely in training and background. In some elementary schools, there 
are full-time proctors on duty, while at other schools with fewer ter- 
minals, the classroom teachers serve as proctors. In college courses, 
the teaching assistant usually serves as the proctor, or, in some cases, 
the proctor is a subject-area expert. Since there has been no attempt
to set up general guidelines for CAI proctors, they are likely to see 
their roles differently and, thus, to differ in the amount and kind of 
aid that they give to their students. If research done in CAI is to 
have widespread application, then more thought must be given to
standardizing the proctor's role. If the proctor also happens to be the teacher
of the course, he may sometimes see attempts at standardizing procedures 
(at the terminals) as conflicting with his teaching goals. This conflict 
must be reduced if we wish to use CAI as a research tool. 

The third and final area of concern is curriculum writing, 
particularly those parts of the curriculum written specifically for 
the computer. Ideally, from the point of view of the educational 
researcher, a curriculum should be designed to provide evidence for 
evaluating the hypotheses on which it is based. Frequently, teachers 
of the courses and administrators may not share the researcher's zeal 
for a neat experimental design. Often the curriculum already exists, 
and curriculum writers are not inclined to rewrite their material for 
the researcher's sake. In most cases, there is no empirical evidence 
to convince a teacher or curriculum writer that the changes will be of 
benefit to his students. Thus, researchers often find it necessary to 
develop techniques for examining already existing curricula. As the 
understanding of a particular curriculum grows, the researcher may 
be able to present more objective reasons why a particular curriculum 
should be changed. Thus, a curriculum can be changed in ways which are 
beneficial to the student and to the educational researcher. 

Many of the difficulties mentioned above are sufficiently complex 
to provide, in themselves, the basis for a major study. Therefore, as 
has been previously stated, the scope of this investigation was limited
to the extent discussed in Chapter II, namely, the area of curriculum 
study. I do feel that the problems mentioned in this section are 
important, and I hope the discussion will stimulate further in-depth 
study of them. 
CHAPTER II



DEFINITION OF THE PROBLEM 

The primary objective of this dissertation is to identify some of 
the structural features of an elementary logic curriculum which affect 
logic problem difficulty. A related task is to provide an adequate 
behavioral measure of problem difficulty as well as an objective, 
quantitative characterization of the curriculum structure. In this 
chapter, a detailed description of the curriculum under consideration 
is presented, followed by a discussion of problem difficulty and curri- 
culum structure. 

The study involves several aspects of the existing computer-based 
logic instructional system (LIS) at Stanford University. The term 'logic
instructional system' is used to emphasize that this is the investigation
of a specific curriculum in the context of a large-scale CAI system. The 
computer configuration under consideration is a modified Digital Equipment 
Corporation (DEC) PDP-10 time-sharing system located at IMSSS. 

Becoming operational at Stanford in 1963, LIS was originally
designed as a self-contained tutorial program to teach sentential logic
to bright elementary-school children. It was first implemented on the 
DEC PDP-1 system at IMSSS, and students traveled from the surrounding 
elementary-school districts to the instructional laboratory at Stanford
to take their logic lessons. Later, the students were able to take their 
lessons on teletypes located in their schools. 
Since its inception, both the curriculum and the program have been
under constant modification and revision as new operational modes have 
been added. In the fall of 1969, a version of LIS was implemented on 
the PDP-10 system. In the spring of 1970, data-collection routines were 
added by the author to the PDP-10 version. LIS, as described below, is
the current version of the program with data-collection capabilities. 

The description of LIS will proceed as follows. First, a brief 
description of the modes of problem presentation is given. These are 
multiple-choice, truth-analysis, counterexample, and derive modes. An
example of each mode can be found in Appendix A. Next there is a 
detailed discussion of the types of input which the students are allowed 
to make and the manner in which LIS handles invalid student input. This 
is followed by a discussion of the program clocks. Finally, an outline 
of the subject matter of the LIS curricula is presented. Since this 
study is concerned primarily with student performance, it is not 
appropriate to include a detailed description of the organization and
logic of the operating program. 

The multiple-choice mode needs little explanation. Students are 
presented with a small body of text. The text is usually an explanation 
of a concept followed by a question, or else it is a question on some
previously explained material. Then two or three lettered responses are 
presented, and the student is required to type in the letter corresponding 
to the correct response. If he types in the correct response, the com- 
puter types correct and presents the next problem. If he types an 
incorrect response, the computer types wrong, try again. This continues 
until the student enters a correct response.


In the truth-analysis mode, the student is required to compute the 
truth value of a formula. In one form of truth analysis, the machine
assigns the value T or the value F to each sentential variable 
occurring in the formula and then presents the student with each 
subformula. The student types the truth value for each of these 
subformulas. After he has assigned values to all subformulas, he is 
presented with the whole formula, and he must type in its truth value.
If his answer is correct, he receives the next problem. If it is not,
he must repeat the problem. 
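
The first form of truth analysis can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the LIS program itself: the nested-tuple formula representation and the function names `evaluate` and `subformulas` are hypothetical and do not come from LIS.

```python
# Formulas are nested tuples (an assumed representation):
#   a sentence letter is a string, e.g. "R"
#   ("not", f), ("and", f, g), ("or", f, g), ("imp", f, g) are compound formulas


def evaluate(formula, assignment):
    """Return the truth value of a formula under an assignment of sentence letters."""
    if isinstance(formula, str):  # a sentence letter
        return assignment[formula]
    op, *args = formula
    vals = [evaluate(arg, assignment) for arg in args]
    if op == "not":
        return not vals[0]
    if op == "and":
        return vals[0] and vals[1]
    if op == "or":
        return vals[0] or vals[1]
    if op == "imp":
        return (not vals[0]) or vals[1]
    raise ValueError(f"unknown connective: {op}")


def subformulas(formula):
    """List the compound subformulas, innermost first, ending with the whole formula."""
    if isinstance(formula, str):
        return []
    result = []
    for arg in formula[1:]:
        result.extend(subformulas(arg))
    result.append(formula)
    return result


# The machine fixes the values of the sentence letters; the student would then
# be asked for the value of each subformula in turn and finally of the whole formula.
formula = ("imp", ("and", "R", "S"), "R")
assignment = {"R": True, "S": False}
for sub in subformulas(formula):
    print(sub, evaluate(sub, assignment))
```

In this sketch the expected answers are computed so that each student response could be checked against them; how LIS actually stores or checks formulas is not described in the text.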

In the other form of truth-analysis problem, the student is given
the truth value of the conclusion. His task is to assign truth values
to the sentence letters such that the conclusion takes on its given
value. As in the other type, the problem is repeated until the student
makes the correct truth assignments.

The counterexample mode is similar to the truth-analysis mode.
The student is presented with a formula and zero or more premises and
asked to make truth assignments such that the premises are true and the 
conclusion false. He is presented with each variable, and he assigns a 
truth value to it. Using his assignments, he computes the truth values 
of each subformula and then of each premise. If any premise is found to 
be false, he is required to restart the problem. If the premises are 
true, he is presented with the conclusion and asked to compute its 
truth value. If the conclusion is false, the computer types correct, 
and he is presented with the next problem. If the conclusion is found 
to be true, he must restart the problem. 
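
The condition that a counterexample must satisfy (every premise true, conclusion false) can be illustrated with a short Python sketch. In LIS the student supplies the assignment and the program only checks it; the exhaustive search below, the function name `find_counterexample`, and the use of Python functions to stand for formulas are all assumptions made for illustration.

```python
from itertools import product


def find_counterexample(letters, premises, conclusion):
    """Return an assignment making all premises true and the conclusion false,
    or None if no such assignment exists (the argument is then valid)."""
    for values in product([True, False], repeat=len(letters)):
        v = dict(zip(letters, values))
        if all(p(v) for p in premises) and not conclusion(v):
            return v
    return None


# Unsound argument (affirming the consequent): from P -> Q and Q, infer P.
unsound = find_counterexample(
    ["P", "Q"],
    premises=[lambda v: (not v["P"]) or v["Q"], lambda v: v["Q"]],
    conclusion=lambda v: v["P"],
)
print(unsound)

# Valid argument: from P -> Q and P, infer Q. No counterexample exists.
valid = find_counterexample(
    ["P", "Q"],
    premises=[lambda v: (not v["P"]) or v["Q"], lambda v: v["P"]],
    conclusion=lambda v: v["Q"],
)
print(valid)
```

The restart rule in the text corresponds to the two ways the test inside the loop can fail: a premise evaluates to false, or the conclusion evaluates to true.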

In the derive mode, the student is required to construct a 
derivation. For this purpose, he has at his disposal a large number of 
rules of inference, axioms, and, eventually, theorems. A list of these
rules can be found in Appendix B. (For the sake of brevity, we use the 
term 'rule' to denote 'rule of inference,' 'axiom,' and 'theorem' for the 
remainder of this dissertation.) The student is permitted to type any 
rule which is logically valid at any step in a derivation. The rule need 
not, in any sense, bring the student closer to the desired conclusion. 
Thus, as long as the student continues to enter logically valid rules, 
he is free to use any line of reasoning that he wishes. At present, 
there is a 32-line limit on the length of a derivation, but for the
problems considered here this restriction is inconsequential. 

Except for the rules IP, FIN, and DLL, each rule has the form
$n_1 n_2 X_1 X_2 n_3$, where $n_1$, $n_2$, and $n_3$ are either integers or null, $X_1$ is a
letter of the alphabet, and $X_2$ is a letter or null. To provide an
illustration of the way in which rules are used, we have included
Appendix C. It contains two typical derivation problems. The first
example is from sentential logic and contains an instance of the rule
IP. The second example is a typical algebra problem.

The rule DLL (delete last line) allows the student to "erase" his 
last line. When the student types DLL, the computer deletes all of its 
internal references to the line previously entered by the student. The 
next line entered by the student is given the same number as the last 
line deleted. The student is permitted to delete, sequentially, any 
line that he has entered. 
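
The line-numbering behavior of DLL resembles a stack: lines are deleted from the end, and the next entry reuses the freed number. The following Python sketch illustrates this under simplifying assumptions. The class name `Derivation` and its methods are hypothetical, rule validity is not checked, and premises are not distinguished from student-entered lines.

```python
class Derivation:
    """A derivation as a stack of lines; DLL removes the most recent line."""

    MAX_LINES = 32  # the line limit mentioned in the text

    def __init__(self):
        self.lines = []

    def enter(self, text):
        """Add a line and return the line number assigned to it."""
        if len(self.lines) >= self.MAX_LINES:
            raise ValueError("derivation is limited to 32 lines")
        self.lines.append(text)
        return len(self.lines)

    def dll(self):
        """Delete the last line; its number becomes available again."""
        if not self.lines:
            raise ValueError("no line to delete")
        return self.lines.pop()


d = Derivation()
d.enter("first line")         # numbered 1
d.enter("a false start")      # numbered 2
d.dll()                       # erases line 2
print(d.enter("a new line"))  # the new line is again numbered 2
```

Repeated calls to `dll` remove lines one at a time from the end, which matches the sequential deletion described above.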

If a student attempts to enter a rule which is not logically valid 
or to enter a nonexistent rule or an improper rule format, he is given 
an error message. These are one- or two-line messages typed to the
student which explain the nature of his mistake. Some typical error
messages are included in Example 2 of Appendix C.

A fifth type of problem presentation asks the student to find either 
a derivation or a counterexample (problem 505.25, Appendix A). The
student must decide whether the formula presented is true or false.
If he decides that a counterexample exists, he types CEX and the machine
enters the counterexample mode; otherwise, he types DER, and the machine 
enters the derive mode. In either case, the computer does not evaluate 
his choice. That is, if he types CEX and a counterexample does not exist, 
he is still permitted to try to find one, and vice versa. 

There are three clocks in LIS which are relevant to this discussion. 
These clocks may be thought of as alarm clocks. They are set by the 
program to "ring" or "fire" after some specific duration. When a clock
fires, it signals the program to initiate some particular action. 

Some problems contain hints which are stored with the problem in 
the problem file. If a student desires help, he may type H. A hint is 
available only if one has been written for the problem and the hint 
clock has fired. If a hint exists for a problem, but the clock has not 
fired when a student types H, he is told to wait a little longer. If 
there is no hint for a problem and the student asks for help, he is 
told that no hint is available. The hint clock is set to fire 0.5 
minutes after the beginning of a problem and after each response. 

The problem clock is set to fire two minutes after the last 
student input. If the student inputs any character prior to this time, 
the problem clock is reset. If the clock fires, the student is
automatically signed off the terminal and his session is terminated.

The session clock is set when the student signs on. It fires
fifty minutes later. The student is then signed off at the completion
of his current problem, although the student may sign himself off, at
any time, by typing FIN. He is, of course, free to sign back on again 
at any time, and then his session clock is reset to fifty minutes. 
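
The interplay of the three clocks can be summarized in a small Python sketch. It is a simplified model rather than the LIS implementation: times are plain numbers of seconds, the class and method names are hypothetical, and setting the problem clock at the start of each problem is an assumption (the text states only that it fires two minutes after the last student input).

```python
HINT_DELAY = 30           # 0.5 minutes, in seconds
PROBLEM_TIMEOUT = 2 * 60  # two minutes without input
SESSION_LENGTH = 50 * 60  # fifty minutes


class Clocks:
    def __init__(self, sign_on):
        self.session_fires_at = sign_on + SESSION_LENGTH
        self.hint_fires_at = None
        self.problem_fires_at = None

    def start_problem(self, now):
        self.hint_fires_at = now + HINT_DELAY
        self.problem_fires_at = now + PROBLEM_TIMEOUT  # assumption, see lead-in

    def key_typed(self, now):
        # Any character typed by the student resets the problem clock.
        self.problem_fires_at = now + PROBLEM_TIMEOUT

    def response_entered(self, now):
        # A completed response also restarts the hint clock.
        self.key_typed(now)
        self.hint_fires_at = now + HINT_DELAY

    def request_hint(self, now, hint):
        """Reply to the student typing H; hint is None if none was written."""
        if hint is None:
            return "no hint is available"
        if now < self.hint_fires_at:
            return "wait a little longer"
        return hint

    def idle_sign_off(self, now):
        return now >= self.problem_fires_at

    def session_expired(self, now):
        # When true, the student is signed off after finishing the current problem.
        return now >= self.session_fires_at


clocks = Clocks(sign_on=0)
clocks.start_problem(now=0)
print(clocks.request_hint(now=10, hint="Try rule AA"))
print(clocks.request_hint(now=45, hint="Try rule AA"))
```

The message strings paraphrase the behavior described in the text; the exact wording typed by LIS is not given.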

The logic curriculum is arranged by lesson. Each lesson contains 
a different number of problems and is designed to teach one or more 
concepts. There are five series of lessons. The 100 and 200 series 
lessons were designed for elementary and junior high school students.
The 400, 500 and 600 series lessons were designed primarily for college
students.

The 400 and 500 series lessons concentrate on the axioms for an 
ordered field. The student begins with a review of sentential logic.
He is then given a set of axioms for addition of numbers that includes
commutativity, associativity, and the properties of zero and negative 
numbers. Using the axioms and rules of inference, he derives a number 
of theorems on the addition of numbers. After a theorem has been 
proved by a student, it becomes available to him for use in later proofs. 
Following the section on addition, a similar treatment is given to 
multiplication and fractions. The student next studies some properties 
of the ordering relation "less than." The final section gives the same 
axiomatic treatment to the Boolean or class algebra. 

The 500 series concentrates on the review of sentential logic.
This series was implemented primarily to give the student practice in
presenting counterexamples to unsound arguments. It is the only series
of the college curriculum in which counterexample-mode problems can be
found. 

Finally, the 600 series was added in the fall of 1970 for use in 
Philosophy 3, The Logic of Political Argument. It was designed to 
adhere in structure as closely as possible to the 500 series. Only the
semantic content of some of the problems was changed.

This study is concerned with examining the relationship between 
the structural properties of logic problems and problem difficulty, 
expressed as a function of student performance. In earlier studies of 
this nature on elementary mathematics curricula, the proportion of
students who successfully completed a problem was used as a measure of 
difficulty. In these earlier studies, the problems were such that the 
correctness of a single response was a good indication of whether the 
student had successfully performed the task required. In logic, the 
"correct answer" or the derived expression is not the object of interest. 
The student must present evidence that he has constructed a valid argu- 
ment. The evidence takes the form of a valid derivation using the 
rules of LIS. Further, the student is not permitted to advance to the 
next problem until he has successfully completed his current one. Thus, 
it would not be useful or meaningful to use proportion correct as an 
indicator of problem difficulty. I had to look for other, less obvious, 
measures of problem difficulty. 

In the search for a measure of difficulty I was constrained to 
quantities measurable by our system. Since this was an investigation 
of a college curriculum under actual teaching conditions, it was
desirable to make the data collection invisible to the student. Thus, the
data available were the characters which the student typed to the 
system, the times at which these characters were entered and the system's 
response to the student. In the ensuing paragraphs I consider some of
the alternative measures of difficulty, definable in terms of the 
information at our disposal. 
First, the mean number of error messages per problem can be used
as a measure of problem difficulty. However, there are several possible 
explanations why a student may enter a response which generates an error 
message. First, error messages may occur as a result of typing errors, 
such as a student accidentally hitting the wrong key. Second, a student
may know which rule he needs to proceed but he may be unsure of how to 
enter it in LIS. Or third, he may, in fact, have a faulty understanding 
of a rule. To gain a more complete understanding of the reasons behind 
behavior which results in error messages would require a far deeper 
analysis of error messages than is planned for this study. It is also 
relevant to note that a student may be unable to do a problem and yet 
generate no error messages. He can do this either by having no input 
at all or by inputting rules which he knows, but which are irrelevant 
to a correct derivation. However, the relationship of this measure with 
the other measures defined below was examined. This measure will be 
referred to as variable B5, in order to remain consistent with the order
in which the variables were listed by the data reduction programs.

Next, consider the number of lines in the derivation, that is,
the number of correctly entered rules for a valid derivation. The 
measure of difficulty is defined as the mean number of lines per proof 
per problem and referred to as variable B1. This criterion of difficulty
has two serious drawbacks. First, a proof for a problem may be very 
short, yet the problem is considered, intuitively, difficult. Problems 
which require "tricks" or unusual approaches fall into this category. 
Second, problems which require a large number of lines are sometimes 
considered intuitively easy. These are problems which require 
straightforward applications of familiar rules. 
Third, consider the elapsed time from the start of a problem to its
solution. Define as a measure of difficulty the mean latency to
completion and denote it by B2. More precisely, the latency is the sum of the
latencies for each valid line entered by the student. (See Appendix D 
for a more detailed description.) Unfortunately, one of the objections 
stated in the previous paragraph may be applied to this measure also. 
Latency is an increasing function of the number of lines in a proof.
Thus, "easy" problems which require many lines will have large latencies.
As a result, I was not able to distinguish between short, "tricky"
problems and longer, straightforward ones.

It seems more reasonable to believe that problem difficulty is 
some function of problem length and latency. Thus, a fourth possibility 
is the mean latency per line. This quantity is defined in two ways. 
Variable B3 is defined as

\[ \mathrm{B3} = \frac{1}{N} \sum_{i=1}^{N} \frac{T_i}{L1_i} \]

where $L1_i$ is the number of valid lines entered by student $i$, $T_i$ is
total latency to solution for student $i$, and $N$ is the number of
students solving the problem. Variable B4 is defined as

\[ \mathrm{B4} = \frac{1}{N} \sum_{i=1}^{N} \frac{T_i}{L2_i} \]

where $L2_i$ is $(L1_i - 2 \cdot DLL_i)$, $DLL_i$ is the number of occurrences of the
rule DLL in the proof of student $i$, and $T_i$ and $N$ are as above. Both
of these measures are free from the objections mentioned above and agree 
with one's intuitive feelings of problem difficulty. 
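
As a worked illustration of the two latency-per-line measures, the following Python sketch computes them from per-student records. It assumes that each measure is the average, over the N students who solved the problem, of that student's total latency divided by his line count (L1 for B3, L1 minus twice the number of DLL uses for B4). The function name, the record layout, and the sample numbers are hypothetical.

```python
def mean_latency_per_line(records, corrected=False):
    """Average latency per line over students.

    records: one (total_latency, valid_lines, dll_count) tuple per student.
    corrected=False uses L1 (all valid lines entered), as for B3.
    corrected=True uses L2 = L1 - 2 * DLL, as for B4: each DLL is itself
    a valid line and also cancels one earlier line.
    """
    ratios = []
    for latency, valid_lines, dll_count in records:
        lines = valid_lines - 2 * dll_count if corrected else valid_lines
        ratios.append(latency / lines)
    return sum(ratios) / len(ratios)


# Two hypothetical students: latencies in seconds, invented for illustration.
records = [(120.0, 6, 1), (90.0, 3, 0)]
b3 = mean_latency_per_line(records)                  # (120/6 + 90/3) / 2
b4 = mean_latency_per_line(records, corrected=True)  # (120/4 + 90/3) / 2
print(b3, b4)
```

For the first student the DLL and the line it erased are both dropped from the count, so the same total latency is spread over four lines instead of six.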
It is shown in Chapter IV that variables B3 and B4 are highly
correlated (.99). Note that B3 includes the false starts or irrelevant
paths which the student has decided to "erase" from his proof by the use 
of DLL. B4, on the other hand, includes only those lines which the 
student has decided will comprise his actual proof. Because of the 
close relationship between these variables, variable B4 was chosen as the
measure of difficulty in order to decrease problems of interpretation
in the analysis. Thus, the measure of difficulty is the corrected mean 
latency per line. 

Having defined the empirical measures of problem difficulty, we
now turn to a discussion of the variables or structural features of the 
problems, which are indicators of problem difficulty. These variables 
must be defined solely in terms of problem and/or curriculum structure 
and not as a function of the student's performance. The variables are 
divided into three distinct categories: (a) structural variables, (b) 
"standard proof" variables, and (c) sequential variables. Each category 
is discussed separately. 

Structural variables are those features of a problem which can be 
identified by visually examining the problem. These variables are 
defined solely in terms of the symbols which appear on the teletype 
prior to student input. A brief description of each follows. 

1. The number of words in the problem. This is essentially 
a measure of the amount of information to be processed 
by the student. Symbolic logical connectives (∨, &, ¬, →),
arithmetic operators, sentence letters, algebraic 
variables, numerals, and parentheses are considered as 
one word each. In studies on elementary-school
mathematics (Suppes, Jerman and Brian, 1968), an
analogous variable was significant in predicting
performance on problems.

2. The number of symbols in the sentence to be derived. 
This variable is intended to give one measure of the 
logical complexity of the problem. The procedure for 
obtaining a value for (2) is illustrated by the fol- 
lowing example. Suppose the problem is 

DERIVE: A < (5+4)+1 → A < 5+((1+3)+1) .

There are 23 symbols in the sentence, thus the value 
of the variable is 23. 

3. Number of occurrences of logical connectives in the 
sentence to be derived. This variable is a slightly 
different measure of the logical complexity of the 
problem. To illustrate the procedure for obtaining 
the value of (3), consider the following simple
example:

DERIVE: (R&S) → R .

There are two logical connectives, namely, & and →.
Thus, the value of the variable is 2. 

4. The depth of nesting of the most deeply parenthesized 
expression in the sentence to be derived. This 
variable is intended to reflect another aspect of 
logical complexity. The value of this variable is 
found by counting the number of left parentheses in 
each expression of the sentence to be derived and 
choosing the maximum value. If there are no 
parentheses, the value is zero. To illustrate this
variable, consider again the problem:

DERIVE: A < (5+4)+1 → A < 5+((1+3)+1) .

We find three parenthesized expressions, namely, (5+4),
((1+3)+1) and (1+3). (5+4) has one left parenthesis,
((1+3)+1) has two left parentheses and (1+3) has one
left parenthesis. The maximum value is two, thus the
value of the variable is 2.

5. The number of premises. This variable gives some measure 
of the amount of information a student must take into 
account and use while attempting a derivation. It seems 
reasonable to assume that, as the number of premises
increases, difficulty will also increase. 

6. Problem context (0,1). This variable is a reflection
of the context in which the problem occurs. The 
variable has value one if it is a 500 series problem 
and zero otherwise. 

7. Explanatory material and/or a hint in the problem 
statement (0,1). The variable has value one if the
problem contains explanatory material, zero otherwise. 
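
Variables 2, 3, and 4 are mechanical counts, and the worked examples above can be reproduced with a short Python sketch. The tokenization is an assumption: an arrow (written here as `->` or `→`) and a numeral each count as one symbol, every other non-blank character counts as one symbol, and the set of logical connectives is taken to be ∨, &, ¬, and the arrow. The function names are hypothetical.

```python
import re

# Assumed token pattern: an arrow, a numeral, or any single non-blank character.
TOKEN = re.compile(r"->|→|\d+|\S")
CONNECTIVES = {"∨", "&", "¬", "->", "→"}


def tokens(sentence):
    return TOKEN.findall(sentence)


def symbol_count(sentence):
    """Variable 2: number of symbols in the sentence to be derived."""
    return len(tokens(sentence))


def connective_count(sentence):
    """Variable 3: number of occurrences of logical connectives."""
    return sum(1 for t in tokens(sentence) if t in CONNECTIVES)


def nesting_depth(sentence):
    """Variable 4: depth of the most deeply parenthesized expression."""
    depth = deepest = 0
    for t in tokens(sentence):
        if t == "(":
            depth += 1
            deepest = max(deepest, depth)
        elif t == ")":
            depth -= 1
    return deepest


arithmetic = "A < (5+4)+1 -> A < 5+((1+3)+1)"
print(symbol_count(arithmetic))         # 23, as in the example for variable 2
print(connective_count("(R&S) -> R"))   # 2, as in the example for variable 3
print(nesting_depth(arithmetic))        # 2, as in the example for variable 4
```

Tracking the current depth while scanning gives the same value as counting left parentheses within each parenthesized expression and taking the maximum, which is the procedure described for variable 4.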

The "standard proof" variables have an element of subjectivity in 
their definitions which the first group does not have. They require the 
availability of a solution or proof for the problem. Since the solution 
to a logic problem is not unique, there will be some degree of arbitrari- 
ness in the selection of a "standard proof." For purposes of this study, 
those proofs generated by the author will be considered standard. 
Several criteria were used by the author in generating the standard
proofs. First, the author worked through the entire set of problems 
included in this study two times. The proofs generated the second time 
through are used as standard. An attempt was made to construct proofs 
with a minimal number of lines. Also, within the constraint of pro- 
ducing a minimal proof, an attempt was made to use rules and theorems 
most recently introduced, wherever possible. It is the judgment of the 
author that the great majority of the proofs produced are minimal in 
the sense of containing the least possible number of lines. 

It is true that, from a mathematical standpoint, it might be
desirable to demonstrate that the proofs are minimal. However, the proofs
are surely minimal in the majority of cases given, and an explicit proof
would make very little change in the interpretation of my results.

All but one of the "standard proof" variables are the number of 
occurrences of certain rules used in the standard proof. These rules 
are: 

8. Affirm the antecedent. (AA) 

9. Conditional proof. (CP) 

10. Indirect proof. (IP) 

11. Any axiom. 

12. Any theorem. The material included in this study 
contained only Theorems 1 through 6. 

13. The number of lines in the proof. 

The third group of variables is made up of the sequential 
variables. These variables are meant to measure the effect of position 
of the problem in the curriculum. It is reasoned that the greater the 
number of rules available to the student, the more difficulty he will 
have in deciding which one to use. Also, the number of problems
completed will affect performance. The following variables are an
attempt to quantify these facts.

The first three are simply the number of rules, theorems, and 
axioms available to the student for the problem. That is, the magnitude 
of the number of available: 

14. Rules of inference.

15. Theorems.

16. Axioms.

The next and final variable provides a measure of the "learning"
for each rule. It is defined as:

17. The number of problems since the last introduction of
a rule. This variable gives some measure of the amount
of practice a student has had on a particular rule.

Table 1 lists the measures discussed in this section. Also
included in Table 1 is a transformed variable, denoted S18, which is
of importance in the analysis which follows. I have included it in
Table 1 in order to provide the reader with a complete list of structural
variables used. The significance of variable S18 is discussed in
Chapters III and IV.
TABLE 1 

Behavioral and Structural Variables 

I. Measures of Problem Difficulty 

   B1. Mean number of lines per derivation 
   B2. Mean latency to a correct solution 
   B3. Mean latency per line 
   B4. Correlated mean latency per line (difficulty) 
   B5. Mean number of error messages per derivation 

II. Measures of Problem Structure 

   A. Structural Variables 

      S1. Number of words per problem 
      S2. Number of symbols in sentence to be derived 
      S3. Number of occurrences of logical connectives 
          in the sentence to be derived 
      S4. Depth of nesting of the most deeply parenthesized 
          expression in the sentence to be derived 
      S5. Number of premises 
      S6. Problem context 
      S7. Inclusion of explanatory material and/or hint 
          of the problem statement 

   B. Standard Proof Variables 

      S8.  Number of occurrences of affirm the antecedent (AA) 
      S9.  Number of occurrences of conditional proof (CP) 
      S10. Number of occurrences of indirect proof (IP) 
      S11. Number of occurrences of any axiom 
      S12. Number of occurrences of any theorem 
      S13. Number of lines in the proof 

   C. Sequential Variables 

      S14. Number of rules of inference available 
      S15. Number of theorems available 
      S16. Number of axioms available 
      S17. Number of problems since the last introduction 
           of a rule 

   D. Transformed Structural Variable 

      S18. S5 cubed 
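
To make a few of the structural measures concrete, the following is a minimal Python sketch of how S3, S4 and S18 might be computed from the sentence to be derived. The ASCII connective symbols are an assumption made for illustration; they are not the symbol set used by the LIS, and the function names are hypothetical.

```python
def connective_count(sentence, connectives=("->", "&", "|", "~")):
    """S3-style count: occurrences of logical connectives.

    The default connective symbols are assumed for illustration only.
    """
    return sum(sentence.count(symbol) for symbol in connectives)


def nesting_depth(sentence):
    """S4-style measure: depth of the most deeply parenthesized expression."""
    depth = deepest = 0
    for ch in sentence:
        if ch == "(":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == ")":
            depth -= 1
    return deepest


def transformed_premises(number_of_premises):
    """S18 as listed in Table 1: the number of premises (S5) cubed."""
    return number_of_premises ** 3


sentence = "A < (5 + 4) + 1 -> A < 5 + ((1 + 3) + 1)"
print(connective_count(sentence), nesting_depth(sentence))  # prints 1 2
print(transformed_premises(3))                              # prints 27
```

The example sentence is the conditional from one of the problems discussed in Chapter IV; it contains one connective and its deepest parenthesized expression is nested two levels.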
CHAPTER III 



DESCRIPTION OF THE STUDY AND THE MODELS 

In this chapter I discuss the population utilized, outline the 
method of data collection, describe certain characteristics of the 
collected data, and outline the methods of analysis. The primary 
objective of this analysis is to describe in a precise way the 
relationship between the structural and behavioral measures of difficulty 
and to develop models which will enable us to predict student performance 
from the structural features of the problems. A secondary objective is 
to provide some general descriptive information about student performance 
on the LIS. 

The population used in this study consisted of the 27 Stanford 
University students who enrolled in Philosophy 157 in the summer quarter 
of 1970, the period during which the data were collected. No special 
procedures, other than normal departmental prerequisites, were used in 
the selection of these students. The group consisted only of students 
who had decided to take the course. 

The curriculum under investigation consisted of 205 problems from 
the computer-based segment of the course. Although the number of 
students involved in the study is not large, a considerable quantity of 
information has been collected for each student. Thus, I feel that an 
ample amount of information is available to successfully accomplish the 
objectives of this study, even though its generalizability to all 
student populations is limited. 
The students were proctored during their sessions at the terminals 
by the philosophy graduate students who gave the lecture portion of the 
course. They received three hours of traditional classroom instruction 
per week, in addition to the time which they spent at the computer 
terminal. Also, there was always someone available who was familiar with 
the computer system and the logic program and who was able to deal with 
any operational difficulties. 

The logic data collection routines were added to the LIS in the 
spring and early summer of 1970. They were designed and programmed by 
the author. When the logic program was converted from the PDP-1 to the 
PDP-10, no provisions for data collection were made. Thus, it was 
necessary to modify certain sections of an already existing program. 

These modifications required several steps. A special data 
collection routine had to be written in assembly language and interfaced 
with the logic program. It was decided to store the raw data on disk 
files during the day and then to transfer each day's data to magnetic 
tape, where it was kept for later reduction and analysis. The necessary 
programs were written and debugged in the spring of 1970. 

During the time that the data were being collected, some data were 
lost. As a result of long-term experience with the system (two years), 
I feel justified in stating that data loss was in no way systematic. 
However, to support this opinion rigorously would require a much more 
definitive analysis of the system than is presently available, and I 
feel that it would be neither feasible nor appropriate to include a 
detailed analysis of the system in this study. 

During the summer of 1970, while the data were being collected, a 
second series of programs was written by the author. They were designed 
to convert the raw data into a form acceptable by the standard Biomedical 
Programs (BMD) used in the final stage of the analysis (Dixon, 1970). 
These intermediate programs are described in detail in Appendix D. 

The results presented in this study were obtained by means of two 
BMD programs. First, the overall means, standard deviations and 
correlations of all of the variables described in Chapter II were computed. 
For this purpose, I modified the BMD06M Canonical Analysis Program to 
run on the IMSSS PDP-10 system (see Appendix D). An outline of the 
computational procedure used may be found in the BMD Manual, pp. 207-213. 
These results are discussed in Chapter IV. 

The next step in the analysis was to describe formally the nature 
and degree of the relationship between the behavioral and structural 
variables. To do this, the canonical correlations and canonical 
coefficients were computed by means of BMD06M. Although canonical 
analysis is a well-known procedure, an outline of the model is provided 
to avoid any ambiguity in terminology. The development follows that of 
Morrison (1967). 

Consider the two sets of variates: the behavioral variables and the 
structural variables. Assume that the first set has p variates and the 
second set has q variates. Suppose that the p + q variates are from some 
multidimensional population which has been partitioned such that: 

\[
\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{12}' & \Sigma_{22} \end{pmatrix}
\]

It is assumed that: 



1. The elements of $\Sigma$ are finite. 

2. $\Sigma$ is of full rank p + q. 

3. The first $r \le \min(p, q)$ characteristic roots of 

\[
\Sigma_{11}^{-1} \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{12}'
\]

are distinct. 

From this population, N observation vectors have been randomly 
drawn and the sample has been partitioned such that: 

\[
\mathbf{x}' = (\mathbf{x}_1', \mathbf{x}_2'), \qquad
S = \begin{pmatrix} S_{11} & S_{12} \\ S_{12}' & S_{22} \end{pmatrix}
\]

in conformance with the above. 



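We wish to determine the linear compounds 

\[
\begin{aligned}
u_1 &= \mathbf{a}_1' \mathbf{x}_1, & v_1 &= \mathbf{b}_1' \mathbf{x}_2 \\
u_2 &= \mathbf{a}_2' \mathbf{x}_1, & v_2 &= \mathbf{b}_2' \mathbf{x}_2 \\
    &\;\;\vdots                    &     &\;\;\vdots \\
u_s &= \mathbf{a}_s' \mathbf{x}_1, & v_s &= \mathbf{b}_s' \mathbf{x}_2
\end{aligned}
\]

such that the sample correlation of $u_1$ and $v_1$ is greatest, the sample 
correlation of $u_2$ and $v_2$ is greatest among all linear compounds 
uncorrelated with $u_1$ and $v_1$, and so on for all s = min(p, q) 
possible pairs. 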

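To do this, solve for $\lambda$ in 

\[
\left| S_{12} S_{22}^{-1} S_{12}' - \lambda S_{11} \right| = 0
\]

Order the roots from largest to smallest, $c_1, c_2, \ldots, c_s$. These are the 
squares of the canonical correlations. The coefficients are obtained from 
the equations 

\[
(S_{12} S_{22}^{-1} S_{12}' - c_i S_{11})\, \mathbf{a}_i = 0, \qquad
(S_{12}' S_{11}^{-1} S_{12} - c_i S_{22})\, \mathbf{b}_i = 0
\]

where $\mathbf{a}_i$ and $\mathbf{b}_i$ are chosen to satisfy 

\[
\mathbf{a}_i' S_{11} \mathbf{a}_i = 1 \quad \text{and} \quad \mathbf{b}_i' S_{22} \mathbf{b}_i = 1
\]
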
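
As a check on the notation, it may help to work the special case of a single variate in the first set (p = 1). This worked case is added for illustration and is not part of Morrison's development; the lower-case symbols $s_{11}$ and $\mathbf{s}_{12}$ are introduced here only for this purpose. With p = 1, $S_{11}$ reduces to the scalar variance $s_{11}$ and $S_{12}$ to the row vector $\mathbf{s}_{12}'$ of covariances with the q variates of the second set, so the determinantal equation has a single root:

```latex
\[
\mathbf{s}_{12}' S_{22}^{-1} \mathbf{s}_{12} - \lambda\, s_{11} = 0
\quad\Longrightarrow\quad
c_1 = \frac{\mathbf{s}_{12}' S_{22}^{-1} \mathbf{s}_{12}}{s_{11}} = R^2 .
\]
```

That is, the single root is the squared multiple correlation of the one variate on the q variates of the other set, and the canonical correlation is its positive square root.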

The final stage of the analysis was to fit a regression model in 
order to predict problem difficulty as a function of problem structure. 
Although the primary goal was to predict problem difficulty, regression 
models using B1 and B2 as the dependent variables were also considered. 
Thus, some idea of the predictive power of the structural variables with 
respect to these other behavior measures was obtained. 

The program used for the regression analysis was the BMD02R. This 
program was fully implemented on the PDP-10 by IMSSS staff in June, 1970 
and further modified by the author (see BMD Manual and Appendix D). 

Since regression analysis is also a standard statistical procedure, it 
does not seem appropriate to give a full description of the theory of 
regression analysis here. However, the model is presented for purposes 
of developing notation. 

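The general multiple linear regression model can be written as: 

\[
\underset{(n \times 1)}{Y} \;=\; \underset{(n \times p)}{X}\;\underset{(p \times 1)}{\beta} \;+\; \underset{(n \times 1)}{e}
\]

where 

\[
Y = \begin{pmatrix} Y_1 \\ \vdots \\ Y_n \end{pmatrix}, \quad
X = \begin{pmatrix} X_{10} & \cdots & X_{1,p-1} \\ \vdots & & \vdots \\ X_{n0} & \cdots & X_{n,p-1} \end{pmatrix}, \quad
\beta = \begin{pmatrix} \beta_0 \\ \vdots \\ \beta_{p-1} \end{pmatrix}, \quad
e = \begin{pmatrix} e_1 \\ \vdots \\ e_n \end{pmatrix}
\]

$Y_i$ is the mean problem difficulty for the i-th problem, 
$X_{ij}$ are the values of the p-1 structural variables, $1 \le j \le p-1$, 
$\beta_j$ are the parameters to be estimated, 
$e_i$ are the errors, and 
$X_{i0} = 1$ for all i. 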
We can write the normal equations as: 

\[
X'X\,B = X'Y
\]

Assume $E(e) = 0$ and $V(e) = I\sigma^2$; then the least squares estimators 
B of $\beta$ are 

\[
B = (X'X)^{-1} X'Y
\]

Assume further that $e \sim N(0, I\sigma^2)$; then 

\[
B \sim \text{M.V.N.}\left(\beta,\; \sigma^2 (X'X)^{-1}\right)
\]
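
The normal equations can be illustrated with a short Python sketch that uses only the standard library. It is not the BMD02R procedure; the helper names and the four data points are made up for illustration, and a single predictor is used so the result can be checked by hand.

```python
def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def least_squares(x_rows, y):
    """Return B satisfying the normal equations X'X B = X'Y."""
    X = [[1.0] + list(row) for row in x_rows]      # X_i0 = 1 for all i
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    XtY = [sum(xi * yi for xi, yi in zip(col, y)) for col in Xt]
    return solve(XtX, XtY)


# Made-up data lying exactly on the line y = 1 + 2x.
x_rows = [[1.0], [2.0], [3.0], [4.0]]
y = [3.0, 5.0, 7.0, 9.0]
B = least_squares(x_rows, y)
print(B)  # approximately [1.0, 2.0]
```

Solving the normal equations directly, rather than forming the inverse of X'X, gives the same estimator and is the more common choice in practice.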



The ANOVA table is shown in Table 2. 

Insert Table 2 here 

If the model is correct, $MS_{RES} = s^2$ is an estimate of $\sigma^2$. Define the 
coefficient of multiple determination $R^2$ as: 

\[
R^2 = (B'X'Y - n\bar{Y}^2) / (Y'Y - n\bar{Y}^2)
\]



This is usually interpreted as the proportion of variance accounted for 
by the regression. 
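
A minimal Python sketch of these quantities follows. It assumes the fitted values come from a least squares fit with an intercept, in which case the residual sum of squares equals Y'Y - B'X'Y and the total sum of squares about the mean equals Y'Y - nȲ². The function name and the five data points are made up for illustration.

```python
def anova_table(y, fitted, p):
    """Sums of squares and mean squares for a regression with p parameters.

    y: observed values; fitted: least squares fitted values from a model
    with p parameters (intercept included).
    """
    n = len(y)
    y_bar = sum(y) / n
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_residual = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_regression = ss_total - ss_residual
    return {
        "regression": (p - 1, ss_regression, ss_regression / (p - 1)),
        "residual": (n - p, ss_residual, ss_residual / (n - p)),
        "total": (n - 1, ss_total),
        "r_squared": ss_regression / ss_total,
    }


# Made-up data; the fitted values are those of the least squares line
# 2.2 + 0.6 x evaluated at x = 1, ..., 5.
y = [2.0, 4.0, 5.0, 4.0, 5.0]
fitted = [2.8, 3.4, 4.0, 4.6, 5.2]
table = anova_table(y, fitted, p=2)
print(table["r_squared"])  # approximately 0.6
```

Here the regression accounts for about 60 percent of the variance of the made-up dependent variable, and the residual mean square (about 0.8) plays the role of the estimate of the error variance.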

I shall now discuss the assumptions made in the regression analysis 
and the procedures used to check the validity of these assumptions for 
our data. First, it is assumed that the model is linear in the parameters. 
Since this was the first attempt at analysis of college student performance 
on the LIS, no information was available to use as a guide in the selec- 
tion of a nonlinear model. Thus, until definite information about the 
form of the relationship between the variables is available, a linear 
model is assumed. 
TABLE 2 

Analysis of Variance for Stepwise 
Multiple Linear Regression 

Source        df     SS                       MS 

Regression    p-1    $B'X'Y - n\bar{Y}^2$     $(B'X'Y - n\bar{Y}^2)/(p-1)$ 
Residual      n-p    $Y'Y - B'X'Y$            $(Y'Y - B'X'Y)/(n-p)$ 
Total         n-1    $Y'Y - n\bar{Y}^2$ 

The other assumptions concern the distribution properties of the 
errors. If e is the vector of errors, then assume that $E(e) = 0$ and 
$V(e) = I\sigma^2$, that is, the errors are uncorrelated and have common 
variance. An assumption that the errors are normally distributed is not 
necessary to obtain estimates of the parameters; it is necessary only 
in order to make tests of statistical significance. These assumptions can 
be examined by plotting the residuals. The residuals are defined as: 

\[
\hat{e}_i = Y_i - \hat{Y}_i
\]

where $\hat{Y}_i$ is the value computed from the fitted model for the i-th problem. 

If the fitted model were correct, the residuals should have exhibited 
tendencies that would seem to confirm the assumptions. My version of the 
BMD02R program allowed, as optional output, plots of (a) residuals versus 
computed values, (b) residuals versus the independent variables, and (c) 
dependent variable versus the independent variables. I made all plots in order 
to determine if the assumptions appeared to be violated. Where the 
assumptions appeared to be violated, the plots were used to pinpoint the 
sources of trouble and transformations on the existing variables were 
used to correct for the violations. 

In some cases the assumption $V(e) = I\sigma^2$ appeared to be violated, 
perhaps due to the fact that the measure of difficulty is essentially a 
latency. Again we attempted to remedy this situation by a transformation. 
Kruskal (1968) discussed a number of variance-stabilizing transformations. 
The various transformations suggested by Kruskal were considered, and I 
selected the square-root transformation as the one most useful for my 
purposes. Kruskal also stated that many authors have remarked that 
frequently (although not invariably) a single transformation also improves 
normality, as well as stabilizing variance. 
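
The effect of a square-root transformation on unequal spread can be sketched with a small Python example. The two groups of latencies below are made up and deliberately chosen so that the spread grows with the mean; they are not data from the study.

```python
import math
import statistics

# Made-up latencies (seconds) for two hypothetical groups of problems.
fast_problems = [9.0, 16.0, 25.0]
slow_problems = [81.0, 100.0, 121.0]


def spread_ratio(group_a, group_b):
    """Ratio of sample standard deviations, group_b over group_a."""
    return statistics.stdev(group_b) / statistics.stdev(group_a)


raw_ratio = spread_ratio(fast_problems, slow_problems)
sqrt_ratio = spread_ratio([math.sqrt(v) for v in fast_problems],
                          [math.sqrt(v) for v in slow_problems])

print(raw_ratio)   # well above 1: the slower group is much more variable
print(sqrt_ratio)  # approximately 1 after the square-root transformation
```

In this constructed case the transformation equalizes the two spreads exactly; with real latencies one would expect only an approximate improvement, to be judged from the residual plots.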


In summary, the analysis was carried out as follows: first, the 
behavioral data were reduced to a form usable by the standard statistical 
routines. In the process, we output descriptive summaries of college 
student performance on the LIS. Next the BMD06M Canonical Analysis 
program was used to obtain a concise measure of the relationship between 
the two sets of variables listed in Figure 1, Chapter II. Finally, using 
the intuitively best measure of difficulty, correlated mean latency per 
line, as the dependent variable, I did a stepwise multiple linear 
regression in order to develop a model which could account for problem 
difficulty as a function of problem structure. 



CHAPTER IV 



RESULTS 

In this chapter I discuss the results of the analyses which were 
described in detail in Chapter III. First, the summary statistics for 
all the variables studied are presented. Next, we discuss some 
interesting aspects of the data which do not appear in the summary tables. 
These include a brief discussion of the problems which had extreme values 
on the behavioral variables. Then we look at the correlations among the 
variables and discuss the canonical analysis. This section concludes 
with a discussion of the regression analyses. 

Table 3 contains the mean, standard deviation and range for each 
of the behavioral variables. Table 4 contains these statistics for the 
structural variables. A brief discussion of several of the values found 
in the tables will be informative. 



Insert Tables 3 and 4 about here 

First, in Table 3 note that the means of variables B3 and B4 
differ by less than one unit and their ranges are identical. Thus, I 
have assumed that these variables are slightly different measures for the 
same underlying behavior. I have chosen to use variable B4 as the 
"measure of difficulty" for the reasons given in Chapter II (p. 15). 

A second interesting aspect of the results is the low error message rate, 
variable B5. In fact, there were 26 problems for which there was no error 
at all. This implies that the students were adept at using the rules they 
had learned. 
TABLE 3 

Means, Standard Deviations and Ranges 
for Behavioral Variables 

Variable     Mean    Standard Deviation     Low      High 

B1           4.55           3.32            1.00     15.80 
B2          84.18          85.53            3.97    415.71 
B3          15.77           7.54            3.70     47.25 
B4          16.43           8.34            3.70     47.25 
B5           0.34           0.37            0.00      1.73 

TABLE 4 

Means, Standard Deviations and Ranges 
for Structural Variables 

Variable     Mean    Standard Deviation     Low      High 

S1          19.99          20.28            4.00    138.00 
S2          11.64           6.19            1.00     31.00 
S3           1.07           1.81            0.00      9.00 
S4           1.03           0.91            0.00      5.00 
S5           0.46           0.82            0.00      3.00 
S6           0.22           0.42            0.00      1.00 
S7           0.19           0.39            0.00      1.00 
S8           0.24           0.59            0.00      3.00 
S9           0.45           0.68            0.00      3.00 
S10          0.07           0.25            0.00      3.00 
S11          0.25           0.52            0.00      2.00 
S12          0.09           0.37            0.00      2.00 
S13          3.84           2.85            1.00     15.00 
S14         15.13           3.90            5.00     19.00 
S15          0.37           1.14            0.00      6.00 
S16          1.28           1.91            0.00      5.00 
S17          5.44           4.88            0.00     23.00 

In comparing B1, Table 3, with S13, Table 4, we see that the 
students have been very successful in producing minimal proofs. However, 
this must be considered in light of the fact that there were 51 problems 
which could be solved by a one-line proof, a fact which was reflected in 
the behavioral data where there were a total of b'J problems for which the 
B1 value was less than 2.00. Further, the variance of B1 for some of the 
longer problems is quite large, indicating that fewer students produced a 
minimal proof for these problems. 

Tables 5 and 6 contain the problem statements and the standard 
proofs for the seven problems having extreme values on the behavioral 
measures. For the low values on variables B1 and B5, the problems were 
chosen arbitrarily from those with the appropriate magnitude. Tables 5 
and 6 provide insight into the features of the problems and curriculum 
which give rise to extreme values on the behavioral measures. A 
familiarity with these logic problems will add meaning to the discussion 
of the analysis presented below. A brief explanation of each of these 
problems is given followed by a discussion of the relationships among 
the variables for these problems. Readers unfamiliar with the rules of 
LIS may refer to Appendix B. 

Problem 415032 received a value of 15.80 for B1. It begins with a 
hint telling the student that there is a certain redundancy in the rules 
which he has available. At this point in the curriculum he has been 
given all five axioms for addition plus the first three theorems. He 
is asked to derive 6 = 3 + 3. It is possible to produce a derivation 
using the axioms and theorems, but this will not result in the minimal 
proof. To obtain the minimal proof the student must use the rules learned 
earlier in the curriculum. 
This problem requires 14 lines for its standard solution. In 
addition, it would be considered a difficult problem on all of the 
measures considered. It is ranked (15) on measure B2 with a latency 
of 252.91 sec., (20) on measure B5 with .9 error messages and (85) on 
measure B4 with 16.24 sec. per line. The problem involves five 
applications of the rule ND and appropriate algebraic manipulations and 
algebraic substitutions which are accomplished in this case by the rules 
AR, CE, CA, and RE. 



Insert Table 5 about here 



The problem ranked highest on measure B2 is 415010. Again this 
problem would be considered very difficult on all of the measures. It 
is ranked (5) on measure B5 with 1.57 error messages, (8) on measure B4 
with 57.86 sec/line, and (12) on measure B1 with 11.09 lines. In this 
problem, the student is asked to derive the conditional: if A is less 
than (5+4)+1 then A is less than 5+((1+3)+1). The student can easily 
verify that it is true since obviously (5+4)+1 equals 5+((1+3)+1). One 
approach could be to show that A < 10 -> A < 10 and then show 
10=(5+4)+1=5+((1+3)+1) and substitute. However, this would require more 
than seven lines. There are, of course, several other approaches. 

The problem ranking highest on measure B5 is 414050. In this 
problem the student must derive the statement that A+(3+(-A)) equals 
1+(1+1), a statement which is obviously true. This problem is similar 
to problem 415052 except that conditional proof is not required and the 
student, at this point in the curriculum, has no theorems available to 
him. As in the two previous cases, it would be considered a difficult 
problem on the other measures also. It was (2) on measure B1 with 15.68 
TABLE 5 

Problems Receiving Highest Value 
on Behavioral Variables 

415.32: 
THERE ARE SOME SUPERFLUITIES AMONG OUR RULES, BUT SINCE WE ARE NOT AFTER 
MATHEMATICAL ELEGANCE WE TOLERATE THEM. A SIMILAR SITUATION EXISTS IN 
THE RULES OF SENTENTIAL LOGIC. 

DERIVE: 6 = 3 + 3 

ND6         (1)    6 = 5 + 1 
ND5         (2)    5 = 4 + 1 
1.2RE1      (3)    6 = (4 + 1) + 1 
3AR2        (4)    6 = 4 + (1 + 1) 
ND2         (5)    2 = 1 + 1 
5CE1        (6)    1 + 1 = 2 
4.6RE1      (7)    6 = 4 + 2 
ND4         (8)    4 = 3 + 1 
7.8RE1      (9)    6 = (3 + 1) + 2 
9AR2        (10)   6 = 3 + (1 + 2) 
ND3         (11)   3 = 2 + 1 
11CE1       (12)   2 + 1 = 3 
12CA1       (13)   1 + 2 = 3 
10.13RE1    (14)   6 = 3 + 3 
CORRECT 

415.10: 
DERIVE: A < (5 + 4) + 1 -> A < 5 + ((1 + 3) + 1) 

WP          (1)    A < (5 + 4) + 1 
AS  (A + B) + C = A + (B + C) 
A:5 
B:4 
C:1         (2)    (5 + 4) + 1 = 5 + (4 + 1) 
1.2RE1      (3)    A < 5 + (4 + 1) 
ND4         (4)    4 = 3 + 1 
3.4RE1      (5)    A < 5 + ((3 + 1) + 1) 
5CA2        (6)    A < 5 + ((1 + 3) + 1) 
1.6CP       (7)    A < (5 + 4) + 1 -> A < 5 + ((1 + 3) + 1) 
CORRECT 

405.23: 
DERIVE: A = C 

P           (1)    A + B = A + C -> B = C 
P           (2)    B = C -> A = C 
P           (3)    A + B = A + C 
1.3AA       (4)    B = C 
2.4AA       (5)    A = C 
CORRECT 

414.50: 
DERIVE: A + (3 + (-A)) = 1 + (1 + 1) 

AI  A + (-A) = 0 
A:A         (1)    A + (-A) = 0 
1AE 
:3          (2)    (A + (-A)) + 3 = 0 + 3 
2AR2        (3)    A + ((-A) + 3) = 0 + 3 
3CA2        (4)    A + (3 + (-A)) = 0 + 3 
Z  A + 0 = A 
A:3         (5)    3 + 0 = 3 
5CA1        (6)    0 + 3 = 3 
4.6RE1      (7)    A + (3 + (-A)) = 3 
ND3         (8)    3 = 2 + 1 
7.8RE2      (9)    A + (3 + (-A)) = 2 + 1 
ND2         (10)   2 = 1 + 1 
9.10RE1     (11)   A + (3 + (-A)) = (1 + 1) + 1 
11AR2       (12)   A + (3 + (-A)) = 1 + (1 + 1) 
CORRECT 


lines, (5) on measure B2 with 303.17 sec. and (65) on measure B4 with 
18.20 sec. per line. This problem also requires the derivation of a 
complex formula and involves the use of two axioms, AI and Z, as well 
as some complicated algebraic manipulations and substitutions. 

Problem 405023, which ranked highest on measures B4 and B3, differs 
from the previous problems in two interesting ways. First, it does not 
rank very high on the other measures. For B2, it is (65) with a latency 
of 94.50. For B5, it is (117) with .07 error messages. For B1, it is 
(151) with 2.00 lines. Second, this problem requires only two applications of 
rule AA for its solution and, thus, does not seem intuitively difficult. 
However, one might explain its observed difficulty by the fact that it 
was preceded by 19 multiple-choice problems. This problem offers a 
dramatic illustration of the effects of surrounding context on student 
performance on a particular problem. 

Insert Table 6 about here 

Table 6 contains those problems which received the lowest values 
on the behavioral measures. These problems have several features in 
common and thus they are discussed as a group. First, they are all 
problems which require only one line for their solution. Second, each 
problem would be rated as "easy" on all of the behavioral measures. 

Third, each problem has a value of zero on measure B5. For problems 
412023 and 414004 the student is told exactly what he must type in 
order to obtain the solution. The slightly higher than minimal latencies 
for these problems are probably due to the time required for the student 
to read the accompanying text. Problem 414005 is of precisely the same 



TABLE 6 

Problems Receiving Lowest Values 
on Behavioral Variables 

412.23: 
THERE IS A SHORT FORM OF CA SIMILAR IN SOME RESPECTS TO USES OF RE. 
IN ORDER TO DERIVE A + B = 3 + 6 FROM THE PREMISE A + B = 6 + 3 
SIMPLY TYPE '1CA2'. 

DERIVE: A + B = 3 + 6 

P           (1)    A + B = 6 + 3 
1CA2        (2)    A + B = 3 + 6 
CORRECT 

414.4: 
TO USE THE Z AXIOM 
1) TYPE 'Z' AND SPACE 
2) AFTER THE COMPUTER TYPES IN THE AXIOM AND 'A:' TYPE 
   THE TERM YOU WANT TO REPLACE 'A'. 

DERIVE: 5 + 0 = 5 

Z  A + 0 = A 
A:5         (1)    5 + 0 = 5 
CORRECT 

414.5: 
DERIVE: 17 + 0 = 17 

Z  A + 0 = A 
A:17        (1)    17 + 0 = 17 
CORRECT 



type as the preceding problem 414004, which introduces the Z axiom. The 
only difference is that the student must type 17 instead of 5 for the 
substitution into the axiom. 

Tables 7, 8 and 9 contain the correlations of the behavioral 
variables with one another, the structural variables with one another 
and the behavioral variables with the structural variables, respectively. 
These correlations were obtained as part of the standard output of the 
BMD06M program. The results have been separated into three tables for 
ease of examination and discussion. 

Insert Table 7 about here 

In Table 7, we find several interesting correlations which give 
some insight into the nature of the relationships among the various 
measures of difficulty. First, observe that B1 is highly correlated with 
B2 and B5 but not with B3 and B4. It is not surprising that latency and 
error rate increase with the length of proof. However, it is reassuring 
to see the correlation of B1 with B3 and B4 is not high, indicating that 
our measure of difficulty is not a simple function of problem length. 

The correlations of B2 with B3 and B4 are seen to be somewhat 
higher. The almost perfect correlation of B3 and B4 provides further 
evidence that they are measuring the same underlying behavior and 
further justification for the decision to choose only one of them as 
the difficulty measure (B4). One final observation is that all of the 
correlations among our behavioral measures are positive and equal to or 
greater than .37. 
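
For readers who wish to reproduce correlations of this kind, the following is a minimal Python sketch of the product-moment correlation. It is not the BMD06M computation, and the two short lists stand in for per-problem values of two behavioral measures; they are made up for illustration.

```python
import math


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up values for five hypothetical problems.
lines_per_derivation = [1.0, 2.0, 3.0, 4.0, 5.0]
errors_per_derivation = [2.0, 4.0, 5.0, 4.0, 5.0]
r = pearson(lines_per_derivation, errors_per_derivation)
print(round(r, 2))  # prints 0.77
```

A correlation matrix such as Table 7 is obtained by applying the same function to every pair of variables.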

Table 8 contains the correlations of the structural variables with 
one another. Since these variables are defined solely in terms of 
TABLE 7 

Correlations of Behavioral Variables 

        B1      B2      B3      B4      B5 

B1     1.00    0.90    0.37    0.40    0.77 
B2             1.00    0.64    0.68    0.88 
B3                     1.00    0.99    0.57 
B4                             1.00    0.61 
B5                                     1.00 

curriculum structure, an examination of their correlations will provide 
some insight into certain features of the curriculum. The majority of 
the correlations are low; only 22 of the 185 correlations have an absolute 
value greater than .40. The large number of low correlations is desirable 
because an attempt was made to define the variables so that they reflect 
nonredundant features of the curriculum. Since it is impractical to 
discuss each of the 185 correlations, only those variables which appear 
to be of most interest are discussed. 

Insert Table 8 about here 

In examining the correlations, we are able to distinguish two 
patterns. First, a number of correlations are indicative of the 500 
lessons. It should be recalled that these lessons deal with sentential 
logic. This is reflected in the high correlations between S6 and S3, 
S4, S9, S10, S14. For example, the high positive correlation between 
S3 and S6 indicates that a greater number of logical connectives are 
found in problems on logic than in problems on algebra. The correlation 
between S6 and S4 indicates more nesting of parentheses in the first part 
of the curriculum and the correlations of S6 with S9 and S10 
suggest more frequent use of conditional proof (CP) and proof by 
contradiction (IP). The high correlation between S6 and S14 reflects 
the fact that most of the rules become available in the first part of 
the curriculum. This is further supported by the high positive correla- 
tions between S3 and S9, and S3 and S10. There is also evidence that 
the proofs are longer in the first part of the curriculum than in the 
second part because of the correlation between S3 and S13 and a correla- 
tion of 0.30 between S6 and S13. 

The second pattern consists of those high correlations which arise 
as a result of the manner in which the variables are defined. For example, 
there is a positive correlation between S11 and S16 because an axiom 
cannot occur in a proof before the number of axioms available becomes 
greater than zero. Similar explanations, based on the definition of 
the variables, can be given for the correlations between S12 and S15, 
S12 and S16, S14 and S16, S15 and S16, S1 and S7, S2 and S4 and S3 and S4. 

Finally, two other correlations which appear in the analysis are 
worth mentioning. First, there is a correlation of .48 between S5 and 
S8. It appears that problems which use several occurrences of AA have 
the greatest number of premises. An example of such a problem is problem 
405023 in Table 5. Second, the high correlation between S9 and S13 
indicates that problems requiring conditional proof tend to be longer 
than those not requiring the use of this rule. 

The discussion now turns to an examination of the relationship 
between the two sets of variables. The correlation between the 
behavioral and structural variables can be found in Table 9. Next, 
the relationship is described more formally by means of the canonical 
correlation analysis. Finally, the predictive models obtained from the 
regression analyses are presented, first the models which have variables 
B1 and B2 as the dependent variable and then, in more detail, the model in 
which difficulty (variable B4) is the dependent variable. 

Insert Table 9 about here 

The correlations found in Table 9 between the two sets of variables 
are rather low and in the majority of cases almost zero. The largest 
TABLE 9 

Correlations Between Behavioral 
and Structural Variables 

         B1       B2       B3       B4       B5 

S1     -0.06    -0.03     0.05     0.05     0.01 
S2      0.21     0.16    -0.03    -0.03     0.10 
S3      0.44     0.37     0.17     0.18     0.25 
S4      0.34     0.37     0.11     0.13     0.25 
S5     -0.16    -0.08     0.24     0.27    -0.09 
S6      0.36     0.34     0.19     0.23     0.27 
S7     -0.19    -0.13     0.07     0.07    -0.11 
S8      0.17     0.11     0.06     0.05     0.05 
S9      0.44     0.34     0.09     0.11     0.25 
S10     0.33     0.31     0.23     0.27     0.30 
S11     0.06     0.08    -0.03    -0.02     0.15 
S12     0.05     0.05     0.05     0.04     0.04 
S13     0.93     0.74     0.27     0.27     0.60 
S14    -0.13    -0.10    -0.07     0.08    -0.06 
S15     0.09     0.08     0.09     0.07     0.07 
S16     0.08     0.08     0.03     0.01     0.09 
S17     0.19     0.10    -0.03    -0.03     0.05 


correlations are those between S13 and the behavioral variables B1, B2 
and B5. That is, there are high correlations between the minimal number of 
lines in a proof and the actual length, latency and number of error 
messages for the proof. 

Variables S3, S6 and S9 are also highly correlated with B1, B2 and 
B5. However, as is evident from Table 8, these structural variables are 
also very highly correlated with each other and it is not easy to 
interpret their effect on the behavioral variables from Table 9 alone. 
Variable S10 also appears to be important. This variable is discussed 
in more detail later. 

The structural variables most highly correlated with the difficulty 
variable B4 are S5, S10 and S13, all .27. From Table 8, it can be seen 
that these structural variables are not highly correlated with each other. 
They play an important role in the regression model discussed below. 

Note that most of the remaining structural variables have almost zero 
correlations with B4. Thus, we are led to consider models which involve 
linear combinations of the variables. 

Table 10 contains the results of the canonical analysis. 

Insert Table 10 about here 

Behavioral variable B3 is omitted from the analysis for the reasons 
discussed in Chapter II and above. Thus, there were four canonical 
correlations and four sets of coefficients for the canonical variates. 
Since I am interested only in describing the dependencies among the 
variables and do not intend to use the derived variates for later 
analyses, I have not explicitly computed the canonical variates from 
TABLE 10 

Canonical Correlations and Coefficients 

Canonical Correlation = 0.94682261 

Coefficients for the first set of variables: 
   -1.326259(B1)     0.284225(B2)     0.021786(B4)     0.107016(B5) 

Coefficients for the second set of variables: 
    0.073856(S1)    -0.146555(S2)     0.123607(S3)     0.057187(S4) 
    0.047831(S5)    -0.269110(S6)     0.009731(S7)    -0.018433(S8) 
    0.005036(S9)    -0.036156(S10)    0.050815(S11)   -0.035614(S12) 
   -0.935880(S13)   -0.048610(S14)    0.007074(S15)   -0.084335(S16) 
    0.006210(S17) 

Canonical Correlation = 0.52323435 

Coefficients for the first set of variables: 
   -0.089763(B1)     0.224815(B2)    -1.261107(B4)     0.335463(B5) 

Coefficients for the second set of variables: 
    0.101760(S1)    -0.290956(S2)     0.164078(S3)    -0.096213(S4) 
   -0.157035(S5)    -0.951645(S6)    -0.321546(S7)     0.324336(S8) 
   -0.030440(S9)    -0.313280(S10)    0.029467(S11)   -0.020757(S12) 
    0.093578(S13)   -0.322343(S14)   -0.196313(S15)   -0.139318(S16) 
    0.048549(S17) 

Canonical Correlation = 0.37973930 

Coefficients for the first set of variables: 
    1.844434(B1)    -1.473808(B2)     0.818170(B4)    -1.291125(B5) 

Coefficients for the second set of variables: 
   -0.193068(S1)     0.060709(S2)     1.114298(S3)    -0.114576(S4) 
    0.051367(S5)    -0.928428(S6)     0.366820(S7)     0.176483(S8) 
   -0.011327(S9)    -0.387504(S10)   -0.688379(S11)   -0.125276(S12) 
   -0.177405(S13)   -0.464047(S14)   -0.006174(S15)    0.685214(S16) 
    0.401779(S17) 

Canonical Correlation = 0.22914162 

Coefficients for the first set of variables: 
    1.913588(B1)    -4.001316(B2)     0.961281(B4)     1.651518(B5) 

Coefficients for the second set of variables: 
    0.518413(S1)    -0.550062(S2)    -1.176869(S3)     0.189711(S4) 
   -0.054340(S5)     1.058815(S6)    -0.120565(S7)     0.280015(S8) 
    0.500092(S9)    -0.461083(S10)    0.125329(S11)   -0.316266(S12) 
    0.039960(S13)    0.477343(S14)    0.603351(S15)   -0.321938(S16) 
    0.356688(S17) 

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the coefficients. In the table, the canonical correlation is followed 
first by the set of coefficients for the behavioral variables, namely, 
B1, B2, B4 and B5, and then the coefficients for the structural variables, 
S1 through S17. In interpreting the coefficients in Table 6, one must 
remember that the canonical correlations were obtained from the 
covariance matrix. Thus, the magnitude of the coefficients depends 
on the magnitude of the variables considered. To illustrate what this 
means, consider variables B2 and B5 and their respective coefficients 
for the canonical correlation 0.52. From Table 3, we see that the mean 
for B2 is 84.18 and the mean for B5 is [illegible]; the coefficients are 
.22 and .34 for B2 and B5, respectively. Thus, on the average, B2 contributes 
18.52 units to the canonical variate whereas B5 contributes only 0.77. 
Ignoring the magnitudes of the variables, one would say that variable 
B5 plays the more important role due to the larger magnitude of its 
coefficient but when the magnitudes of the contribution are considered, 
it is B2 which makes, by far, the larger contribution to the canonical 
variate. 
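The comparison in the preceding paragraph is simply coefficient times mean. The following is a minimal sketch of that arithmetic, not part of the original analysis; the function name `average_contribution` is hypothetical, and the inputs are the rounded B2 values quoted in the text.

```python
def average_contribution(coefficient, mean):
    """Average contribution of a variable to a canonical variate:
    its coefficient multiplied by its mean. This matters because the
    coefficients were obtained from the covariance matrix, so they
    are not scale-free."""
    return coefficient * mean


# B2 at the 0.52 canonical correlation: coefficient .22, mean 84.18.
b2_contribution = average_contribution(0.22, 84.18)
print(round(b2_contribution, 2))
```

Applying the same function to B5 requires its mean from Table 3; a variable with a larger coefficient can still contribute far less when its mean is small.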

For the maximum canonical correlation .95, the canonical variate 
for the behavioral variables places the most weight on B1 and B2. The 
canonical variate for the structural variables places the most weight 
on S1, S2 and S13. Essentially, the first variate is some measure of 
the length of a problem, that is, a linear combination of number of lines 
and latency. Similarly, its correlative in the concomitant variables is 
a structural measure of length, where S1 and S2 are measures of the amount 
of information to be processed and S13 is the minimal length of a proof. 
Thus, the first correlation establishes a link between the behavioral 
measures of length of a problem and their structural counterparts. 
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The magnitude of the correlation indicates that the relationship between 
these variables is a very strong one. 

The second canonical correlation 0.52 appears to place the greatest 
weight on variables B2 and B4 for the behavioral variate and on variables 
S1, S2, and S14 for the structural variate. This case yields, primarily, 
a comparison of "difficulty" expressed as a weighted sum of B2 and B4 
with "structural complexity" expressed as a weighted sum of S1 and S2, 
information to be processed, and S14, availability of rules. The 
variable S14 appears to make the greatest contribution to the structural 
canonical variate. 

The final two canonical correlations are rather low and, thus, 
their corresponding derived variates are not of as much interest as 
those described above. For both of these correlations, the most 
important structural variables are S1 and S14. In addition, for the 
0.38 correlation, variable S17 contributes heavily to the structural 
variate and for the 0.23 correlation, variable S2 is the other heavily 
weighted variable. 

The procedure used for the regression analyses is considered next. 
Using the results of the canonical correlation analysis as a guide, I 
ran three separate regression analyses in which B1, B2 and B4 were the 
dependent variables. The plots described in Chapter III, p. 28 were 
obtained as part of the output for these regressions. An examination of 
these plots reveals that variables B2 and B4 appear to violate the 
homoscedasticity assumption. After applying a square-root transformation to 
variables B2 and B4, we find that this assumption appears to be satisfied. 

For example, Figure 1 shows the plot of the residuals versus 
variable B4. One can observe a rather obvious dependence of magnitude 
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of residuals on magnitude of B4 (see dotted lines). In Figure 2, the same 
plot is shown after applying the square-root transformation to B4. Notice 
that the pattern, which was observed in Figure 1, no longer appears. 
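The effect of the square-root transformation on the residual pattern can be illustrated with a small sketch. Everything here is an assumption made for illustration: the data are synthetic (not the LIS data), the helper names `fit_line` and `spread_ratio` are hypothetical, and a single predictor stands in for the full set of structural variables. The sketch compares the residual spread for large fitted values with the spread for small fitted values, before and after taking square roots.

```python
import math
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def spread_ratio(xs, ys):
    """Residual SD for the upper half of the fitted values divided by
    the residual SD for the lower half. A value near 1.0 suggests
    roughly constant spread (homoscedasticity)."""
    a, b = fit_line(xs, ys)
    pairs = sorted((a + b * x, y - (a + b * x)) for x, y in zip(xs, ys))
    half = len(pairs) // 2
    low = [resid for _, resid in pairs[:half]]
    high = [resid for _, resid in pairs[half:]]
    return statistics.pstdev(high) / statistics.pstdev(low)


# Synthetic data: sqrt(y) is linear in x with a constant +/-0.5 wobble,
# so y itself has a spread that grows with its fitted value.
n = 200
xs = [i / (n - 1) for i in range(n)]
roots = [3 + 2 * x + (0.5 if i % 2 == 0 else -0.5) for i, x in enumerate(xs)]
ys = [r * r for r in roots]

raw_ratio = spread_ratio(xs, ys)                           # noticeably above 1
sqrt_ratio = spread_ratio(xs, [math.sqrt(y) for y in ys])  # close to 1
print(round(raw_ratio, 2), round(sqrt_ratio, 2))
```

A ratio well above one for the raw variable and near one for the transformed variable corresponds to the funnel shape disappearing between the two residual plots.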

Insert Figures 1 and 2 about here 

Several transformations were applied to some of the independent 
variables also. However, none of the transformed variables, except for 
the cube of S5, entered into the regression equations. 

The regressions were redone, this time using variables B1, √B2 and 
√B4 as the dependent variables. The results for these regressions may be 
found in Tables 11, 12, and 13. These tables give the step at which each 
variable entered the regression, the value of R and R² at that step, the 
increase in R² due to the addition of that variable, the F-value required 
for deletion and the final regression coefficient for the variable. It 
would be pointless to discuss any variable which did not contribute at 
least 1 percent to R² and such variables have been eliminated from the 
models. The Anova tables are given only for the actual models used. They 
contain the variables in the equation with the step that the variable entered, 
the coefficient, the standard error of the coefficient and its computed 
t-value, the multiple correlation coefficient, and the standard error of 
estimate of Y. 

Insert Tables 11, 12, 13 about here 

Table 14 contains the results for variable B1. Variable S13 accounts 
for 86 percent of the variation in this case. Since S13 is the 

Insert Table 14 about here 
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PLOT: RESIDUALS (Y-AXIS) VS COMPUTED Y (X-AXIS) 



[Figure 1: residuals (y-axis) versus computed Y (x-axis) for B4; x-axis ticks run from about 6.09 to 29.067. Plot points not recoverable.] 

Figure 1 
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PLOT: RESIDUALS(Y-AXIS) VS COMPUTED Y (X-AXIS) 

[Figure 2: residuals (y-axis, ticks from -1.44 to 2.46) versus computed Y (x-axis, ticks from 2.582 to 5.142) for the square root of B4. Plot points not recoverable.] 

Figure 2 



TABLE 11 

Summary Table for Variable B1 

Step   Variable   Multiple   Multiple   Increase   F Value     Last Reg. 
Num.   Entered    R          R²         in R²      For Del.    Coefficient 

  1    S13        0.93150    0.86769    0.86769    1318.6469    1.16334 
  2    S12        0.93590    0.87591    0.00822      13.0515    1.16625 
  3    S10        0.93900    0.88172    0.00581       9.7616    0.90045 
  4    S8         0.94080    0.88510    0.00338       5.9794   -0.28200 
  5    S15        0.94170    0.88680    0.00169       3.0351   -0.23345 
  6    S16        0.94280    0.88887    0.00207       3.6043    0.06910 
  7    S6         0.94330    0.88981    0.00094       1.7474    2.69823 
  8    S3         0.94520    0.89340    0.00359       6.4720   -0.36914 
  9    S2         0.94700    0.89681    0.00341       6.1304    0.07617 
 10    S5         0.94850    0.89965    0.00284       5.4828    0.28674 
 11    S1         0.95000    0.90250    0.00285       5.7656   -0.01312 
 12    S4         0.95050    0.90307    0.00057       1.0777   -0.18570 
 13    S14        0.95060    0.90364    0.00057       1.0770    0.07470 
 14    S17        0.95090    0.90421    0.00057       1.0968   -0.01728 
 15    S7         0.95100    0.90440    0.00019       0.3109    0.17845 
 16    S11        0.95100    0.90440    0.00000       0.1903    0.09323 
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TABLE 12 



Summary Table for Variable √B2 

Step   Variable   Multiple   Multiple   Increase   F Value     Last Reg. 
Num.   Entered    R          R²         in R²      For Del.    Coefficient 

  1    S13        0.78690    0.61921    0.61921    326.9081     1.17775 
  2    S10        0.80120    0.64192    0.02261     12.6375     2.30444 
  3    S12        0.81050    0.65691    0.01499      8.7281     1.70137 
  4    S18        0.81860    0.67011    0.01320      7.9325     0.09758 
  5    S8         0.82650    0.68310    0.01300      8.1053    -0.80486 
  6    S6         0.83220    0.69256    0.00945      5.9470     4.87922 
  7    S11        0.83750    0.70141    0.00885      5.8126     0.62469 
  8    S2         0.84120    0.70762    0.00621      4.1587     0.12205 
  9    S5         0.84510    0.71419    0.00658      4.4065     0.88308 
 10    S3         0.84770    0.71860    0.00440      2.9721    -0.47333 
 11    S14        0.84970    0.72199    0.00339      2.3257     0.15648 
 12    S1         0.85030    0.72301    0.00102      0.7322    -0.01469 
 13    S17        0.85110    0.72437    0.00136      0.8755    -0.03637 
 14    S7         0.85150    0.72505    0.00068      0.5108     0.43606 
 15    S9         0.85170    0.72539    0.00034      0.2904     0.22169 
 16    S4         0.85180    0.72556    0.00017      0.0933    -0.12122 
 17    S16        0.85190    0.72573    0.00017      0.0432     0.05144 
 18    S15        0.85190    0.72573    0.00000      0.0330    -0.04807 
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TABLE 13 




Summary Table for Variable √B4 

Step   Variable   Multiple   Multiple   Increase   F Value     Last Reg. 
Num.   Entered    R          R²         in R²      For Del.    Coefficient 

  1    S13        0.27500    0.07563    0.07563    16.4435      0.09061 
  2    S5         0.36900    0.13616    0.06054    14.0246      0.68067 
  3    S10        0.43250    0.18706    0.05090    12.4530      0.66759 
  4    S8         0.47630    0.22686    0.03981    10.1854     -0.30851 
  5    S6         0.50680    0.25685    0.02998     7.9536      1.35355 
  6    S16        0.54620    0.29833    0.04149    11.5861      0.02457 
  7    S2         0.56010    0.31371    0.01537     4.3769      0.02947 
  8    S7         0.57170    0.32684    0.01313     3.7903      0.37023 
  9    S12        0.57690    0.33281    0.00597     1.7321      0.18215 
 10    S3         0.58000    0.33640    0.00359     1.0083     -0.09061 
 11    S17        0.58230    0.33907    0.00267     0.7875     -0.01447 
 12    S14        0.58480    0.34199    0.00292     0.8292      0.04600 
 13    S15        0.58620    0.34363    0.00162     0.5003      0.06032 
 14    S1         0.58700    0.34457    0.00094     0.2493     -0.00204 
 15    S11        0.58740    0.34504    0.00047     0.1462      0.06531 
 16    S4         0.58760    0.34527    0.00023     0.0553      0.03150 
 17    S9         0.58760    0.34527    0.00000     0.0193      0.02255 
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TABLE l4 

ANOVA Table and Significant Variables for B1 

Analysis of Variance: 

              DF    Sum of Squares    Mean Square    F-Ratio 
Regression      1        1935.42         1935.42     1392.39 
Residual      201         279.20            1.39 

Variables in Equation: (Constant = .295) 

            Step                                     Computed 
Variable    Entered    Coefficient    Std. Error     T-Value 
S13            1           1.11           .03         37.00* 

Number of steps        1 
Multiple R             0.93 
Multiple R²            0.87 
Std. Error of Est.     0.10 

*p < .001 
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number of lines in the minimal proof, one can say, with the qualifications 
mentioned on p. 33, that the students were quite successful in finding the 
minimal proofs. The remaining variables account for only an additional 
4 percent increase in R². Thus, it appears that the more interesting 
aspects of performance on the logic problems are not reflected in the 
problem length. 
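The summary figures in the ANOVA table for B1 can be recomputed from its sums of squares and degrees of freedom. The sketch below is only a consistency check under the usual definitions (mean square = sum of squares / DF, F = ratio of mean squares, R² = regression share of the total sum of squares); the function name `anova_summary` is hypothetical.

```python
def anova_summary(ss_reg, df_reg, ss_res, df_res):
    """Recompute mean squares, F-ratio and R-squared from an ANOVA
    table's sums of squares and degrees of freedom."""
    ms_reg = ss_reg / df_reg
    ms_res = ss_res / df_res
    return {
        "ms_reg": ms_reg,
        "ms_res": ms_res,
        "f_ratio": ms_reg / ms_res,
        "r_squared": ss_reg / (ss_reg + ss_res),
    }


# Sums of squares and DF as reported in Table 14 (variable B1).
table_14 = anova_summary(ss_reg=1935.42, df_reg=1, ss_res=279.20, df_res=201)
for name, value in table_14.items():
    print(name, round(value, 2))
```

The recomputed F-ratio differs slightly from the tabled 1392.39 because the table divides by the residual mean square already rounded to 1.39.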

Table 15 contains the results for the regression using the square 
root of total latency, √B2, as the dependent variable. In this case, 
the model was able to account for 68 percent of the variation in total 
latency with six variables. The value for R² is significantly nonzero 
at p < .01. 

Insert Table 15 about here 

The most important variable and the first to enter the equation 
is variable S13, the number of lines in the minimal proof. It is not 
surprising that the amount of time spent on a problem is very strongly 
dependent on its length. However, the other variables included in this 
model begin to give insight into some of the other factors affecting the 
time a student spends on a problem. 

The second variable to enter the equation is variable S10, the 
number of occurrences of IP in the standard proof. The increase in 
latency may be attributed to two factors. First, the rule, which requires 
three arguments, is complicated to use; the error rate for problems 
requiring the use of the rule IP was, in general, higher than for other 
problems. Second, a student must spend time to discover the contradiction 
needed for the indirect proof. 
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TABLE 15 



ANOVA Table and Significant Variables for 
the Square Root of B2 

Analysis of Variance: 

              DF    Sum of Squares    Mean Square    F-Ratio 
Regression      6        2556.52          426.09      73.57 
Residual      196        1135.17            5.79 

Variables in Equation: (Constant = 2.99) 

            Step                                     Computed 
Variable    Entered    Coefficient    Std. Error     T-Value 
S6             6           1.22           0.50         4.18** 
S8             5          -1.06           0.34         2.82* 
S10            2           2.27           0.78         2.91* 
S12            3           1.63           0.47         3.47** 
S13            1           1.21           0.07        17.28** 
S18            4           0.15           0.04         3.75** 

Number of Steps        6 
Multiple R             0.83 
Multiple R²            0.68 
Std. Error of Est.     2.41 

 * p < .01 
** p < .001 
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The third significant variable to enter the equation is S12, the 
number of occurrences of a theorem in the minimal proof. The increase 
in latency due to the presence of theorems in a proof may be explained 
as follows. Unlike rules and axioms, there are no mnemonics for the 
theorems. If a student feels that a theorem is appropriate, he must 
first consult his theorem sheet to see if there is such a theorem and 
to find its number (e.g., TH3). Thus, except in the improbable event 
that a student has memorized the theorem numbers, these problems require 
more time, even though they are not necessarily more difficult. 

The transformed variable S18, the cube of the number of premises, 
enters the equation next. This variable represents, in part, the 
information to be processed by the student before he solves the problem. 
Each additional premise greatly increases the amount of time spent on 
the problem. 

The fifth significant variable to enter the equation is S8, the 
number of occurrences of AA in the minimal proof. Note that this 
variable has a negative coefficient. This variable was also significant 
in the regression equation obtained for √B4, where it also received a 
negative coefficient. An interpretation for it is given in the 
discussion below. 

The final variable in the model for latency is S6, the problem 
context. This variable indicates that, on the average, the problems 
in the CEX portion of the curriculum require more time. 

None of the remaining variables contribute as much as 1 percent 
to R², as can be seen from Table 12. Thus, they are not included in 
the model for latency. 
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Table l6 
contains the results of the regression which used √B4, the 
square root of latency per line, as the dependent variable. Those 
variables which contribute over 1 percent to R² and are significantly 
nonzero were chosen for the model. With the seven variables meeting this 
criterion, the model was able to account for 33 percent of the variation. 
Although this value for R² is not as impressive as the values in the 
previous two cases, the F-ratio of 12.735 is significant for p < .01. 
Further, an examination of the important variables in this first attempt 
to predict problem difficulty has revealed some of the important structural 
features which may be further broken down and explored in future studies 
of this nature. Some possibilities are considered in Chapter V. But 
first, the results of the present analysis are presented. 

Insert Table 16 about here 

Variable S13, the number of lines in the standard proof, is the 
first variable to enter the equation. It accounts for 8 percent (see 
Table 13) of the total variation. Thus, the length of a proof is an 
indicator of difficulty, but it does not assume the overwhelming 
importance which it had in the two previously discussed models. 

The second variable to enter is S5, the number of premises, and it 
accounts for an additional 6 percent of the variation. The great majority 
of problems in which premises are given are to be found in the CEX portion 
of the curriculum. Hence, this variable may also be accounting for part 
of the effect due to problem context along with the information to be 
processed. 

Variable S10, the number of occurrences of IP in the standard proof, 

which accounted for an additional 5 percent of the variation, enters the 
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TABLE 1 6 

ANOVA Table and Significant Variables for 
the Square Root of B4 

Analysis of Variance: 

              DF    Sum of Squares    Mean Square    F-Ratio 
Regression      7          58.62            8.37      12.74 
Residual      195         126.23            0.66 

Variables in Equation: (Constant = 2.66) 

            Step                                     Computed 
Variable    Entered    Coefficient    Std. Error     T-Value 
S2             7           0.02           0.01         2.20* 
S5             2           0.68           0.09         7.56** 
S6             5           0.87           0.19         4.69** 
S8             4          -0.40           0.12         3.53* 
S10            3           0.70           0.26         2.69* 
S13            1           0.07           0.03         2.33* 
S16            6           0.14           0.04         3.50* 

Number of Steps        7 
Multiple R             0.56 
Multiple R²            0.33 
Std. Error of Est.     0.81 

 * p < .01 
** p < .001 
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equation next. In addition to the extra time required to use this rule 
(see p. 57), a problem involving the use of IP requires a different 
kind of behavior on the part of the student than that required in a 
straight derivation problem. The results imply that this difference is 
significant and results in increased difficulty. 

The only variable to have a negative coefficient is variable S8, 
the number of occurrences of AA in the standard proof. This variable 
accounts for 4 percent of the variation. Table 9 shows that this variable 
is highly correlated with S5, thus making it somewhat difficult to 
interpret. Note further that the AA rule was used predominantly in the CEX 
portion of the curriculum and only in those problems which could not be 
solved by means of a counterexample. That is, AA appeared only in 
DERIVE-type problems. Thus, this variable might be interpreted as 
accounting for the fact that in the context of the CEX portion of the 
curriculum, derive problems are easier than CEX problems. 

Variable S6, the fifth variable to enter the regression equation, 
receives the largest coefficient. This is further evidence that problems 
in the CEX portion of the curriculum were more difficult than those in the 
remainder of the curriculum. 

The sixth significant variable to enter is S16, the number of axioms 
available to the student. This variable gives a measure of the amount of 
information at the disposal of the student. This is the only case in 
which one of the "availability" variables (S14-S16) played a significant 
role. 

Finally, the last significant variable to enter the regression 
equation is S2, the number of words in the sentence to be derived. 
This variable is another measure of the information which must be 
processed by the student. 

Seven significant variables which account for 33 percent of the 
variation in problem difficulty are identified. The first two, S2, the 
number of words in the sentence to be derived, and S5, the number of 
premises, are measures of the amount of information which must be 
processed by the student in order to solve the problem. S6 specifies 
whether a problem is included in the CEX part of the curriculum. The 
next three, S8, S10 and S13, are standard proof variables and reflect 
the nature of the required derivation. The final significant variable 
is S16, a measure of the amount of information available to the student, 
in this case the number of axioms. 

In the next chapter, the results presented above are discussed. The 
discussion includes some of the implications and a possible extension of 
the regression model. 
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CHAPTER V 



DISCUSSION 

The investigation described in the previous chapters was the first 
attempt to examine college student performance on LIS. In this chapter, 
we first comment upon the significant variables in the predictive 
difficulty model and define several new variables suggested by the 
results. Next we mention some of the other important results of our 
analysis and discuss the possibility of extending the regression model 
to a process or automaton model. 

For purposes of the ensuing discussion, the seven significant 
variables are categorized under four major headings. The first category 
is problem context containing variable S6. The next category contains 
variables S2 and S5, which reflect the information which the student 
must process. The third category comprises three variables, namely, 
the standard proof variables S10, S8 and S13. The final category 
provides a measure of the available information with S16. One may write 
the predictive model as follows: 

√B4 = .87S6 + .02S2 + .68S5 - .40S8 + .70S10 + .07S13 + .14S16. 
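As an illustration of how the predictive model would be applied to a single problem, the sketch below evaluates the weighted sum and squares the result to return to the latency-per-line scale. The coefficients are those of the model above; the constant 2.66 is taken from Table 16 and is an addition here, since the equation as written omits it. The function names and the example problem's values are hypothetical.

```python
# Coefficients of the predictive model for the square root of B4.
COEFFICIENTS = {
    "S6": 0.87,   # problem context (1 if in the CEX portion, else 0)
    "S2": 0.02,   # size of the sentence to be derived
    "S5": 0.68,   # number of premises
    "S8": -0.40,  # occurrences of AA in the standard proof
    "S10": 0.70,  # occurrences of IP in the standard proof
    "S13": 0.07,  # number of lines in the standard proof
    "S16": 0.14,  # number of axioms available
}
CONSTANT = 2.66  # assumption: regression constant reported in Table 16


def predict_sqrt_b4(problem, constant=CONSTANT):
    """Predicted square root of latency per line for one problem.
    Variables missing from `problem` are treated as zero."""
    return constant + sum(
        coef * problem.get(name, 0) for name, coef in COEFFICIENTS.items()
    )


def predict_b4(problem):
    """Undo the square-root transformation."""
    return predict_sqrt_b4(problem) ** 2


# Hypothetical CEX-portion problem, invented for illustration only.
example = {"S6": 1, "S2": 10, "S5": 2, "S8": 0, "S10": 1, "S13": 5, "S16": 4}
print(round(predict_sqrt_b4(example), 2), round(predict_b4(example), 2))
```

Squaring the prediction is a convenience for reading the result on the original scale; it does not give an unbiased estimate of mean latency per line.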

First consider problem context. The results show, without doubt, 
that the location of a problem in the curriculum is important. If a 
problem is in the CEX portion of the curriculum it is more difficult. 

In order to explore further the effect of a problem's position in the 
curriculum, I ran two additional regression analyses. In one analysis 
the dependent variable was √B4 for the 45 problems in the CEX portion 
of the curriculum; in the other analysis the dependent variable was √B4 
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for the remaining 158 problems. These analyses did not provide any 
additional information on the important features which predict problem 
difficulty. Thus, the procedure of grouping the two parts of the curri- 
culum together did not adversely affect the results or mask the effect of 
any important variable. 

It would also be of interest to determine if there is a sequential 
effect. If a sequential effect exists, the difficulty of a problem would 
be affected by the nature of the immediately preceding problem. In other 
words, if a DERIVE problem is more difficult when preceded by a CEX 
problem than when preceded by another DERIVE problem, we say there is a 
sequential effect. Define a (0,1) variable N1 which takes the value 
one if the preceding problem is of a different type and zero otherwise. 
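A minimal sketch of how this (0,1) sequential variable could be computed from an ordered list of problem types follows. The function name is hypothetical, and assigning zero to the first problem (which has no predecessor) is an assumption not stated in the text.

```python
def sequential_variable(problem_types):
    """Return one 0/1 value per problem: 1 if the preceding problem
    was of a different type, 0 otherwise. The first problem has no
    predecessor and is given 0 (an assumption)."""
    values = []
    previous = None
    for current in problem_types:
        changed = previous is not None and current != previous
        values.append(1 if changed else 0)
        previous = current
    return values


# Hypothetical ordering of problems within a lesson.
sequence = ["DERIVE", "DERIVE", "CEX", "DERIVE", "CEX", "CEX"]
print(sequential_variable(sequence))
```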

The second category deals with the information to be processed. 
Although five variables, S1-S5, have already been defined to provide 
a measure of this aspect of the problem, only two of them, S2 and S5, 
are significant in our model. Variable S2 is the number of symbols in 
the sentence to be derived. Although this is a very crude measure, the 
variable is significant in predicting difficulty. A more refined 
measure of the information in the sentence to be derived would be of 
great value. However, the manner in which this information might be 
quantized is by no means obvious. As a step in the direction of 
capturing some of the information in the sentence to be derived, consider 
the following variable, N2, which retains the information provided by S2 
while providing additional information about the sentence. Assign 
parentheses a base value of zero, all sentence letters, variables and 

^Technically, N1 is a standard proof variable. 
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constants a base value of one, unary operators a base value of two and 
binary operators a base value of three. Then the value of a symbol is 
its base value times the depth of nesting where we define the depth of 
nesting as S4 + 1. The value of N2 is the sum of the values of all of 
the symbols in the sentence to be derived. The following example is 
provided to illustrate N2. Suppose the problem is: 

130131031313013001310310 
DERIVE: A < ( 5 + 4 ) + 1 -> A < ( 5 + ( ( 1 + 3 ) + 1 ) 

130262031313026003930620 
The numbers above the sentence are the base values of the symbols, the 
numbers below are the actual values. Their sum is 56, thus the value 
of N2 is 56. In future studies of this nature, more energy must be 
spent in trying to characterize the information in the sentence to be 
derived. 
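The computation of N2 can be sketched as follows. Two assumptions are made and should be read as such. First, the multiplier for a symbol is taken to be the number of parentheses enclosing it plus one, which is what the per-symbol values in the example imply. Second, the printed sentence has an unbalanced parenthesis, so the right-hand side is taken here to be ( 5 + ( 1 + 3 ) + 1 ), the reading that reproduces the printed per-symbol values and the total of 56. The operator sets and the function name are hypothetical.

```python
# Assumed operator classes; only "<", "+" and "->" occur in the example.
BINARY_OPERATORS = {"<", "+", "->", "&", "V", "="}
UNARY_OPERATORS = {"NOT"}


def n2(tokens):
    """Sum over symbols of base value times (parenthesis depth + 1).
    Base values: parentheses 0, letters/variables/constants 1,
    unary operators 2, binary operators 3."""
    depth = 0
    total = 0
    for token in tokens:
        if token == "(":
            depth += 1          # parentheses themselves contribute 0
        elif token == ")":
            depth -= 1
        else:
            if token in BINARY_OPERATORS:
                base = 3
            elif token in UNARY_OPERATORS:
                base = 2
            else:
                base = 1
            total += base * (depth + 1)
    return total


# A < (5 + 4) + 1 -> A < (5 + (1 + 3) + 1), under the assumed reading.
sentence = ["A", "<", "(", "5", "+", "4", ")", "+", "1", "->",
            "A", "<", "(", "5", "+", "(", "1", "+", "3", ")", "+", "1", ")"]
print(n2(sentence))
```

With no parentheses every multiplier is one, so N2 reduces to a weighted symbol count; nesting is what separates it from S2.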

The second significant variable in this category is S5, the number 
of premises. As mentioned previously, since premises occur chiefly in 
the CEX portion of the curriculum, this variable may reflect, in part, 
the effect of problem context. In any case, the occurrence of premises 
in a problem does result in a considerable increase in difficulty and 
some of this increase is certainly due to the additional amount of 
information to be processed. Since the results indicate that premises 
are important, it would be of value to try to obtain a deeper 
understanding of the effect of premises. To do this we propose two new 
variables, N3 and N4. If there are no premises, N3 and N4 are zero. 
Before proceeding, one must distinguish between relevant and irrelevant 
premises. An irrelevant premise is one which is not used in the solution 
of the problem. With this distinction in mind, define N3 as the sum of 
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the N2 -values of each of the relevant premises and N4 as the sum of the 
N2-values of each of the irrelevant premises. These two new variables 
determine the effect of relevant and irrelevant premises on difficulty. 

They also provide a measure of the complexity of the premises. 

The third category contains variables reflecting the nature of the 
required derivation, namely, the standard proof variables. In the model 
the three significant standard proof variables are S8, S10 and S13. The 
most important variable throughout the analysis has been S13, the number of 
lines in the standard proof. The other two significant standard proof vari- 
ables involve the number of occurrences of specific rules in the standard 
proof, namely, AA and IP. Thus, one is led to consider trying other vari- 
ables which reflect the nature of the required rules in a derivation, without 
going to the obviously impractical extreme of a separate variable for each 
rule. Define N5 as the number of different rules used in the standard deri- 
vation. Second, define N6n as the number of rules in the standard proof 
which require n arguments. This variable was suggested by the importance 
of variable S10. 
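The two proposed counts can be sketched in a few lines. The arity table below is hypothetical except for IP, which the text describes as requiring three arguments; the rule name "R1", the arities assigned to AA and R1, and the function names are placeholders for illustration.

```python
from collections import Counter

# Number of arguments per rule. Only IP = 3 comes from the text;
# the other entries are invented for this example.
RULE_ARITY = {"IP": 3, "AA": 2, "R1": 1}


def n5(standard_proof_rules):
    """Number of different rules used in the standard derivation."""
    return len(set(standard_proof_rules))


def n6(standard_proof_rules, arity=RULE_ARITY):
    """Map n -> number of rule applications requiring n arguments."""
    return Counter(arity[rule] for rule in standard_proof_rules)


# Hypothetical sequence of rules applied in one standard proof.
proof_rules = ["AA", "R1", "AA", "IP"]
print(n5(proof_rules), dict(n6(proof_rules)))
```

Because `n6` returns a counter keyed by n, a single call yields the whole family N6n for a proof rather than one variable at a time.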

The final category contains one significant variable, S16, the number 
of axioms available to the student. This variable provides some measure 
of the amount of information which the student has available to solve the 
problems. This variable brings to mind another issue, namely, the effect 
that "learning" a rule has on difficulty. For example, would variable 
S16 be significant if there had been data on a much more extensive 
portion of the curriculum, that is, if the study included all of 
the theorems on addition? By that time, presumably the axioms would 
have been well "learned" and perhaps variable S16 would no longer be of 
importance. At the present juncture in the research on student performance 
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on logic problems, one may reasonably relegate such considerations to the 
status of "second-order" effects, but in the more refined stages of analysis 
they must be seriously considered. 

The following is a list of the suggested new structural variables: 

N1 Sequential variable (0,1). Takes the value one 
if the preceding problem is of a different type, 
zero otherwise. 

N2 Measure of complexity of sentence to be derived. 

N3 Measure of the complexity of relevant premises. 

N4 Measure of the complexity of irrelevant premises. 

N5 Number of different rules used in derivation. 

N6n Number of rules in the standard proof requiring 
n arguments. 

In addition to providing some first insights into the factors 
affecting problem difficulty, the present study yielded several other 
valuable results. First, the study resulted in a precise and intuitively 
satisfying definition of problem difficulty and provided a method of 
measuring it in terms of student protocols. Second, a large data base 
of student performance in elementary mathematical logic has been established 
from which it is possible to extract much more detailed information. It is 
hoped that other researchers and those interested in the teaching of logic 
will make use of this data base to further their understanding of student 
performance. 

The effort to understand problem solving in mathematical logic should 
not stop with regression models. Suppes (1969) pointed out that "the main 
conceptual weakness of the regression models is that they do not provide 
an explicit temporal analysis of the steps being taken by a student in 
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solving a problem." He then gave an example from research on arithmetic 
performance of elementary-school students which illustrates how an 
automaton model provides a natural tool for the analysis of data in 
arithmetic-problem solving. 

Any mature theory of problem solving must account for the temporal 
sequence which a student goes through in solving a problem. That is, it 
must provide meaningful dynamic links of the variables which affect prob- 
lem difficulty, variables such as those identified in this study. An 
automaton model would appear to be one of the more interesting possibilities 
for this purpose. Since all automata are, at least theoretically, 
programmable on a computer, the terms "automata" and "computer" will be used 
interchangeably in the sequel. 

The development of such models is possible, but the form that they 
should take is not yet clear. At present, there exist a number of 
computer programs which are able to prove theorems, i.e., solve problems 
such as those in the curriculum we have studied. However, the problems 
involved in developing the models are quite serious. First, we must find 
a theorem prover which "solves" problems in a manner analogous to the logic 
student. For example, a theorem prover based on the resolution principle 
(Robinson, 1965) is not appropriate. Then to analyze the student data, 
we must go from a deterministic model to a probabilistic one, that is, we 
must parameterize the model in such a way that it provides a good account 
of the performance data. In the case of arithmetic problems the struc- 
tural variables identified in the regression models were of great value 
in parameterizing the automaton model. 

I fully realize that the results presented here are still a long 
way from providing a fully adequate account of problem-solving in 
mathematical logic. I do, however, feel that I have taken an important 
initial step and gained some valuable insights into factors involved in 
performance on logic problems. 
APPENDIX A 



Examples of the different problem modes on the LIS 



I. Multiple choice: 

502 . 1 : 

IN THIS LESSON YOU WILL LEARN HOW TRUTH OR FALSITY OF A COMPLEX 
FORMULA IS RELATED TO THE TRUTH OR FALSITY OF ITS SUBFORMULAE. FOR 
INSTANCE, IF S IS FALSE, THEN ¬S IS ... 

A) TRUE 

B) NOT TRUE 

A 

CORRECT 

502 . 2 : 

IF S IS TRUE, THEN ¬S IS ... 

A) TRUE 

B) NOT TRUE 

B 

CORRECT 

II. Truth analysis mode: 

503 . 23 : 

FIND THE TRUTH VALUES OF THE FOLLOWING: 

S&(R&S) 

T S 

F R 

LET’S COMPUTE THE TRUTH VALUES FOR ALL SUBFORMULAE 
AND THEN FOR THE FORMULA ITSELF 

F R&S 
F S&(R&S) 

CORRECT 

502 . 36 : 

FIND THE TRUTH VALUE OF THE FOLLOWING: 

¬(¬QV¬R)V¬(Q&R) 

T Q 

T R 

LET'S COMPUTE THE TRUTH VALUES FOR ALL SUBFORMULAE 
AND THEN FOR THE FORMULA ITSELF 

F ¬Q 

F ¬R 

F ¬QV¬R 

T ¬(¬QV¬R) 

T Q&R 

F ¬(Q&R) 

T ¬(¬QV¬R)V¬(Q&R) 

CORRECT 
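
The truth-analysis procedure shown in these two problems (compute the value of every subformula, innermost first, then the formula itself) can be sketched in a few lines of Python. This is only an illustration, not the LIS implementation: the tuple encoding of formulas and the names `show`, `value`, and `truth_analysis` are assumptions made for the example, and the second problem's formula is read as ¬(¬QV¬R)V¬(Q&R).

```python
# Illustrative sketch only; the formula encoding and function names are assumptions.
# A formula is a sentence letter ("Q") or a tuple: ("not", f), ("and", f, g), ("or", f, g).

def show(f, top=True):
    """Print a formula in the notation of the transcripts (¬, &, V)."""
    if isinstance(f, str):
        return f
    if f[0] == "not":
        return "¬" + show(f[1], top=False)
    op = "&" if f[0] == "and" else "V"
    text = show(f[1], top=False) + op + show(f[2], top=False)
    return text if top else "(" + text + ")"

def value(f, env):
    """Truth value of formula f under the assignment env."""
    if isinstance(f, str):
        return env[f]
    if f[0] == "not":
        return not value(f[1], env)
    if f[0] == "and":
        return value(f[1], env) and value(f[2], env)
    return value(f[1], env) or value(f[2], env)

def truth_analysis(f, env):
    """List (value, subformula) rows, innermost subformulas first."""
    rows = []
    def visit(g):
        if isinstance(g, str):
            return
        for sub in g[1:]:
            visit(sub)
        row = ("T" if value(g, env) else "F", show(g))
        if row not in rows:
            rows.append(row)
    visit(f)
    return rows

formula = ("or",
           ("not", ("or", ("not", "Q"), ("not", "R"))),
           ("not", ("and", "Q", "R")))
rows = truth_analysis(formula, {"Q": True, "R": True})
for mark, text in rows:
    print(mark, text)
```

Under these assumptions the rows come out in the same order as the subformula lines of the transcript, ending with the value of the whole formula.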

III. Counterexample mode: 



505.17: 

FIND A COUNTEREXAMPLE: Q&S -> R 

P    (1) QVR 

P    (2) R -> (Q -> S) 

P    (3) (S -> R) -> Q 

CEX  YOU ARE LOOKING FOR A COUNTEREXAMPLE. THUS, THE 
CONCLUSION MUST BE FALSE AND THE PREMISES TRUE. 

NOW ASSIGN CONSISTENT TRUTH VALUES TO THE REMAINING 
SENTENCE LETTERS. 

T Q 

F R 

T S 

LET US CHECK TO SEE IF YOUR TRUTH ASSIGNMENTS MAKE 
THE PREMISES TRUE AND THE CONCLUSION FALSE. 

T QVR 

T Q -> S 

T R -> (Q -> S) 

F S -> R 

T (S -> R) -> Q 

YOUR ASSIGNMENT MAKES ALL THE PREMISES TRUE. 
LET'S CHECK THE CONCLUSION. 

T Q&S 

F Q&S -> R 



CORRECT 



505.15: 



FIND A COUNTEREXAMPLE: ¬R 

P    (1) ¬QV¬R 

P    (2) ¬Q 

CEX  YOU ARE LOOKING FOR A COUNTEREXAMPLE. THUS, THE 
CONCLUSION MUST BE FALSE AND THE PREMISES TRUE. 

NOW ASSIGN CONSISTENT TRUTH VALUES TO THE REMAINING 
SENTENCE LETTERS. 

F Q 

T R 

LET US CHECK TO SEE IF YOUR TRUTH ASSIGNMENTS MAKE 
THE PREMISES TRUE AND THE CONCLUSION FALSE. 

F ¬R 

T ¬QV¬R 

YOUR ASSIGNMENT MAKES ALL THE PREMISES TRUE. 
LET'S CHECK THE CONCLUSION. 

F ¬R 

CORRECT 
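
The check that LIS performs in counterexample mode (all premises true, conclusion false) can also be turned into an exhaustive search over truth assignments. The sketch below is a hedged illustration, not the system's own code: the helper names `implies` and `counterexamples` are assumptions, and the premises and conclusion are those of problem 505.17, read as QVR, R -> (Q -> S), (S -> R) -> Q and Q&S -> R.

```python
from itertools import product

# Illustrative sketch only; helper names are assumptions, not LIS routines.

def implies(a, b):
    """Material conditional a -> b."""
    return (not a) or b

def counterexamples(premises, conclusion, letters):
    """Return every assignment making all premises true and the conclusion false."""
    found = []
    for values in product([True, False], repeat=len(letters)):
        env = dict(zip(letters, values))
        if all(p(env) for p in premises) and not conclusion(env):
            found.append(env)
    return found

# Problem 505.17 as read from the transcript above.
premises = [
    lambda e: e["Q"] or e["R"],                           # (1) QVR
    lambda e: implies(e["R"], implies(e["Q"], e["S"])),   # (2) R -> (Q -> S)
    lambda e: implies(implies(e["S"], e["R"]), e["Q"]),   # (3) (S -> R) -> Q
]
conclusion = lambda e: implies(e["Q"] and e["S"], e["R"])  # Q&S -> R

found = counterexamples(premises, conclusion, ["Q", "R", "S"])
print(found)
```

On this reading the search finds exactly one counterexample, the assignment Q true, R false, S true that the student typed.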



IV. Derive mode: 

505.25: 

DERIVE OR FIND A COUNTEREXAMPLE: QVR 

P      (1) S -> Q 

P      (2) ¬S -> R 

P      (3) S 

DER    OK... CONSTRUCT A DERIVATION OR PROOF. 

1.3AA  (4) Q 

4FD    (5) (Q)V(R) 




CORRECT 



413.33: 






DERIVE: A=6 -> 5+2=A+1 

WP      (1)  A=6 

1AE 
: 1     (2)  A+1=6+1 

2CE1    (3)  6+1=A+1 

ND6     (4)  6=5+1 

3.4RE1  (5)  (5+1)+1=A+1 

5AR2    (6)  5+(1+1)=A+1 

ND2     (7)  2=1+1 

7CE1    (8)  1+1=2 

6.8RE1  (9)  5+2=A+1 

1.9CP   (10) A=6 -> 5+2=A+1 


CORRECT 
APPENDIX B 



A List of The Rules of Inference, Theorems and 
Axioms Used in LIS 



1. Sentential Variables: Q,R,S,U,W 



2. Rules of Inference: 



(a) AA: Affirm the Antecedent, 

(b) WP: Working Premise, 

(c) DN: Double Negation, 

(d) FC: Form a Conjunction, 

(e) RC: Right Conjunct, 

(f) LC: Left Conjunct, 

(g) FD: Form a Disjunct, 

(h) DD: Deny Disjunct, and 

(i) DLL: Delete last line. 



3. Derivation or Proof Procedures: 

(a) CP: Conditional Proof, and 

(b) IP: Indirect Proof. 

1. Numerical Variables: A,B,C,D,E. 

2. Rules of Inference: 



(a) ND: Number Definition, 

(b) CE: Commute Equals, 

(c) AE: Add Equals, 

(d) SE: Subtract Equals, 

(e) LT: Rule of Logical Truth, and 

(f) RE: Replace Equals. 



3. Axioms for Addition: 

(a) CA (Commute Addition): A+B=B+A 

(b) AS (Associate Addition): (A+B)+C=A+(B+C) 

(c) Z (Zero Axiom): A+0=A 

(d) N (Negative Number Axiom): A+(-B)=A-B 

(e) AI (Additive Inverse Axiom): A+(-A)=0 



4. Theorems on Addition: 

Theorem 1: 0+A=A 

Theorem 2: (-A)+A=0 

Theorem 3: A-A=0 


Theorem 4: 0-A=-A 

Theorem 5: -0=0 


Theorem 6: A-0=A 

Theorem 7: A+B=A+C -> B=C 

Theorem 8: A+B=C -> A=C-B 

Theorem 9: A=C-B -> A+B=C 

Theorem 10: A+B=0 -> A=-B 

Theorem 11: A=-B -> A+B=0 

Theorem 12: A+B=A -> B=0 

Theorem 13: -(-A)=A 

Theorem 14: (-(A+B))+B=-A 

Theorem 15: -(A+B)=(-A)-B 

Theorem 16: (-A)-B=(-B)-A 

Theorem 17: -(A-B)=B-A 

Theorem 18: (A-B)-C=A+((-B)-C) 

Theorem 19: (A-B)-C=A-(B+C) 

Theorem 20: A+(B-A)=B 

Theorem 21: A-(A+B)=-B 

Theorem 22: (A-B)+(B-C)=A-C 


5. Additional Rules of Inference: 

(a) ME: Multiply Equals, and 

(b) DE: Divide Equals. 




6. Axioms for Multiplication: 

(a) CM (Commute Multiplication): AXB=BXA 

(b) MS (Associate Multiplication): (AXB)XC=AX(BXC) 

(c) MU (Multiplication by Unity): AX1=A 

(d) MI (Multiplicative Inverse): ¬A=0 -> AX(1/A)=1 

(e) FR (Axiom for Fraction): ¬B=0 -> A/B=AX(1/B) 

(f) U (Unity Axiom): ¬1=0 

(g) DL (Distributive Law): AX(B+C)=(AXB)+(AXC). 



7. Theorems on Multiplication: 



Theorem 30: 1XA=A 

Theorem 31: ¬A=0 -> (1/A)XA=1 

Theorem 32: 1/1=1 

Theorem 33: A/1=A 

Theorem 34: ¬A=0 -> A/A=1 

Theorem 35: ¬B=0&A/B=0 -> A=0XB 

Theorem 36: (B+C)XA=(BXA)+(CXA) 

Theorem 37: AX0=0 

Theorem 38: ¬A=0 -> ¬1/A=0 

Theorem 39: ¬A=0 -> 0/A=0 

Theorem 40: ¬A=0&AXB=1 -> B=1/A 

Theorem 41: ¬A=0&AXB=A -> B=1 

Theorem 42: ¬B=0 -> (A/B)XC=(AXC)/B 

Theorem 43: ¬B=0 -> (A/B)XC=(C/B)XA 

Theorem 44: ¬B=0&¬D=0 -> (A/B)X(C/D)=(C/B)X(A/D) 

Theorem 45: ¬A=0&¬B=0 -> (A/B)X(B/A)=1 

Theorem 46: ¬A=0&AXB=AXC -> B=C 

Theorem 47: ¬A=0&AXB=0 -> B=0 

Theorem 48: ¬AXB=0 -> ¬A=0&¬B=0 

Theorem 49: ¬A=0&¬B=0 -> ¬AXB=0 

Theorem 50: ¬A=0&¬B=0 -> B/(AXB)=1/A 

Theorem 51: ¬A=0&¬B=0 -> (CXB)/(AXB)=C/A 

Theorem 52: (¬B=0&¬D=0)&A/B=C/D -> AXD=CXB 

Theorem 53: ¬B=0&A=BXC -> A/B=C 

Theorem 54: AX(-B)=-(AXB) 

Theorem 55: (-A)X(-B)=AXB 
8. Ordering Axioms: 

(a) NS (Asymmetry): A < B -> ¬B < A 

(b) AD (Additivity): A < B -> A+C < B+C 

(c) MD (Multiplicativity): A < B&0 < C -> AXC < BXC 

(d) TR (Transitivity): A < B&B < C -> A < C 

(e) CN (Connectivity): A≠B -> A < BVB < A 



9. Theorems on Inequalities: 



Theorem 60: ¬A < A 

Theorem 61: A=B -> ¬A < B&¬B < A 

Theorem 62: A < B -> ¬A=B&¬B < A 

Theorem 63: A < 0 -> 0 < -A 

Theorem 64: 0 < -A -> A < 0 

Theorem 65: A+B < A+C -> B < C 

Theorem 66: A < B -> -B < -A 

Theorem 67: -B < -A -> A < B 

Theorem 68: A+(-B) < A+(-C) -> C < B 

Theorem 69: C < B -> A+(-B) < A+(-C) 

Theorem 70: A < 0&B < C -> AXC < AXB 

Theorem 71: A < 0&AXB < AXC -> C < B 

Theorem 72: 0 < A&AXB < AXC -> B < C 

Theorem 73: 0 < 1 

Theorem 74: A < 0 -> 1/A < 0 

Theorem 75: 0 < A&(B < 0&C < 0) -> AXB < BXC 

Theorem 76: A < 0&(0 < B&0 < C) -> AXB < BXC 

Theorem 77: ¬B=0&0 < A/B -> 0 < AXB 

Theorem 78: ¬B=0&0 < AXB -> 0 < A/B 
Boolean or Class Algebra 



1. Class Variables: G,H,M,K,L 

2. Axioms: 

(a) CU (Commute Union): G ∪ H = H ∪ G 

(b) CI (Commute Intersection): G ∩ H = H ∩ G 

(c) UI (Union Identity): G ∪ 0 = G 

(d) II (Intersection Identity): G ∩ X = G 

(e) DU (Distribute Union): G ∪ (H ∩ K) = (G ∪ H) ∩ (G ∪ K) 

(f) DI (Distribute Intersection): G ∩ (H ∪ K) = (G ∩ H) ∪ (G ∩ K) 

(g) EM (Excluded Middle): G ∪ (-G) = X 

(h) RD (Reduction): G ∩ (-G) = 0 

(i) UC (Associate Union): (G ∪ H) ∪ K = G ∪ (H ∪ K) 

(j) IA (Associate Intersection): (G ∩ H) ∩ K = G ∩ (H ∩ K) 

(k) SA (Subclass Axiom): G ∩ (-H) = 0 -> G ⊂ H 

(l) CS (Converse of Subclass): G ⊂ H -> G ∩ (-H) = 0 



3. Theorems: 

Theorem 161: G ∪ ((-G) ∩ H) = G ∪ H 

Theorem 162: G ∩ ((-G) ∪ H) = G ∩ H 

Theorem 163: G ∪ G = G 

Theorem 164: G ∩ G = G 

Theorem 165: G ∪ X = X 

Theorem 166: G ∩ 0 = 0 

Theorem 167: G ∪ (G ∩ H) = G 

Theorem 168: G ∩ (G ∪ H) = G 

Theorem 169: G ∩ (-H) = 0 & G ∩ H = 0 -> G = 0 

Theorem 170: G ∪ (-H) = X & G ∪ H = X -> G = X 

Theorem 171: G ∪ H = 0 -> G = 0 

Theorem 172: G ∩ H = X -> G = X 

Theorem 173: G ∪ H = G ∪ K & G ∩ H = G ∩ K -> H = K 

Theorem 174: (G ∪ H = X & G ∪ K = X)&(G ∩ H = 0 & G ∩ K = 0) -> H = K 

Theorem 175: (G ∪ H = G & G ∪ K = G)&(G ∩ H = 0 & G ∩ K = 0) -> H = K 

Theorem 176: (G ∪ H = X & G ∪ K = X)&(G ∩ H = G & G ∩ K = G) -> H = K 

Theorem 177: -(-G) = G 

Theorem 178: -X = 0 

Theorem 179: G ∪ H = G ∩ H -> G = H 

Theorem 180: G ∩ (H ∩ K) = (G ∩ H) ∩ (G ∩ K) 

Theorem 190: G ⊂ G 

Theorem 191: 0 ⊂ G 

Theorem 192: G ⊂ X 

Theorem 193: G ⊂ H & H ⊂ G -> G = H 

Theorem 194: G ⊂ H -> G ∪ H = H 

Theorem 195: G ∪ H = H -> G ⊂ H 

Theorem 196: G ∪ (-H) = X -> G ⊂ H 

Theorem 197: G ⊂ H -> G ∪ (-H) = X 

Theorem 198: G ⊂ H -> G ∩ H = G 

Theorem 199: G ∩ H = G -> G ⊂ H 

Theorem 200: G ⊂ H & H ⊂ K -> G ⊂ K 

Theorem 201: G ⊂ H -> -H ⊂ -G 

Theorem 202: G ⊂ H & G ⊂ -H -> G = 0 

Theorem 203: G ⊂ H & -G ⊂ H -> H = X 

Theorem 204: G ⊂ G ∪ H 

Theorem 205: G ∩ H ⊂ G 

Theorem 206: G ⊂ K & H ⊂ K -> G ∪ H ⊂ K 

Theorem 207: G ⊂ H & G ⊂ K -> G ⊂ H ∩ K 

Theorem 208: G ⊂ H -> H = G ∪ (H ∩ (-G)) 
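
A class-algebra law of this kind can be sanity-checked by enumerating every subclass of a small universe. The sketch below is offered only as an illustration under stated assumptions: the three-element universe `X`, the helper names, and the choice of laws (Theorem 161, the pair SA/CS, and Theorem 173, with the subclass sign read as "is a subclass of or equal to") are all choices made for the example, and an exhaustive check on one finite universe is a plausibility test rather than a derivation from the axioms.

```python
from itertools import combinations, product

# Illustrative sketch only; the universe and helper names are assumptions.
X = frozenset({0, 1, 2})       # a small universe class
EMPTY = frozenset()            # the empty class, written 0 in the appendix

def subclasses(universe):
    """All subclasses of the universe."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def comp(g):
    """Complement -G relative to X."""
    return X - g

def holds_for_all(law, arity):
    """True if law(...) holds for every choice of `arity` subclasses of X."""
    return all(law(*args) for args in product(subclasses(X), repeat=arity))

# Theorem 161: G U ((-G) n H) = G U H
th161 = lambda g, h: (g | (comp(g) & h)) == (g | h)
# Axioms SA and CS together: G n (-H) = 0 exactly when G is a subclass of H
sa_cs = lambda g, h: ((g & comp(h)) == EMPTY) == (g <= h)
# Theorem 173: G U H = G U K & G n H = G n K -> H = K
th173 = lambda g, h, k: not ((g | h) == (g | k) and (g & h) == (g & k)) or h == k

print(holds_for_all(th161, 2), holds_for_all(sa_cs, 2), holds_for_all(th173, 3))
```

The same `holds_for_all` helper could be pointed at any other law in the list by writing it as a lambda over Python's set operators.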



APPENDIX C 



Two Examples of Derivation Problems from LIS 



This appendix contains two examples of derivation problems from 
LIS. Example 1 is typical of the sentential logic problems. Example 2 
is typical of the algebra problems. 

An explanation of the lines of the derivation in Example 1 follows: 



(1)-(5) These are the given premises to be used in deriving the 
logical sentence R. 

(6) The student introduces the denial of the sentence to be 
derived. To do this, he uses the working premise rule, WP. 
LIS indents this premise and all lines following it until 
the student proves a contradiction and uses the indirect 
proof rule, IP, to derive the denial of what he entered on 
this line. See the explanation for line 14 (below). 

(7) Line 1 is a disjunction and the newly introduced line 6 is 
the denial of one of the disjuncts. The DD rule (Deny 
Disjunct) allows the student to establish the truth of the 
other disjunct S. 

(8) Line 2 is the conditional "if not Q, then not S." Line 7 
states that S is true, so the student uses deny consequent, 
DC, to prove that Q is true. 

(9) The antecedent of the conditional in the line 3 premise is in 
the form of a double negation (not (not Q)); the student has 
proved that Q is true in line 8, so he uses double negation, 
DN, to derive this antecedent. 

(10) Now he uses the affirm the antecedent rule, AA, to derive the 
consequent of line 3. 

(11) He uses double negation again, now on the premise line 4. 

(12) He uses affirm the antecedent again to derive not W. 

(13) He uses deny disjunct again, this time applying line 12 to the 
disjunction on line 5, to get not S. 

(14) He has derived a contradiction with the help of the working 
premise he introduced. On line 7 he has S is true. On 
line 13 he has not S is true. He uses the indirect proof 
rule, IP, to establish the denial of not R, the working 
premise on line 6. 



Insert Table 1 about here 



Now we give a detailed explanation of the steps in the derivation 
of Example 2. There are no premises and the student is being asked to 
prove Theorem 22 which will then become available to him for use in 
later proofs. 

(1) The student introduces the negative number axiom, N. The 
computer prints out the axiom and then allows the student 
to substitute expressions for A and B. In this case, the 
student types A for A and B for B. 

(2) Line 1 is an equation, so the student can commute about =. 
To do this, he uses the commute equals rule, CE, where the 
first 1 is the line number and the second 1 is the occurrence 
number of the =. 

(3) The student wishes to add something to both sides of the 
equality on line 2. To do this, he uses the add equals 
rule, AE, where the 2 is the line number of the equation. 
The computer types a colon, after which the student types 
the expression to be added. The computer then types line 3. 
On the next line the student attempts to type a rule which 
the computer does not recognize. 

(4) The student again uses the negative number axiom. 

(5) He applies CE to line 4. 

(6) The student now uses the replace equals rule, RE. He wishes 
to replace an occurrence in line 3 of the left-hand side of 
the equation in line 5 by the right-hand side of the equation 
in line 5. There is more than one occurrence of (B-C) in 
line 3 and the student specifies which one he wants replaced 
by the occurrence number, 1. 
On the next line he decides to erase line 6. He does this 
by using DLL, delete the last line. 

(7) He again uses RE, this time for the second occurrence of 
(B-C). The student wishes to associate addition to the 
right in line 6. To do this, he uses the associate right 
rule, AR. He wants to associate about the second plus sign, 
hence he uses 2 as the occurrence number. Since this is not 
possible, he receives an error message. 

(8) He again tries AR, only this time the occurrence number of 
the plus sign is 3. 

(9) He associates left about the third plus sign using AL. 

(10) He uses the negative number axiom again. 

(11) He now makes use of a theorem which he had proved earlier. 
A theorem is used in a manner analogous to the axioms. On 
the next line he misuses RE and receives the appropriate 
error message. 

(12) He correctly uses RE. 

(13) He makes use of theorem 1. 

(14)-(15) Two more uses of RE establish the desired theorem. 
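
The role of the occurrence number in RE can be illustrated with a small string-based sketch. It is an assumption-laden simplification: LIS presumably operated on parsed expressions rather than raw text, and the function name `replace_occurrence` and the wording of the error are hypothetical, modeled on the messages in the transcript.

```python
# Illustrative sketch only; a plain-text stand-in for the RE rule.

def replace_occurrence(line, old, new, n):
    """Replace the n-th occurrence (counting from 1) of `old` in `line` by `new`."""
    start = -1
    for _ in range(n):
        start = line.find(old, start + 1)
        if start == -1:
            # Mirrors the kind of message LIS prints when the occurrence is missing.
            raise ValueError(f"THERE ARE NOT {n} OCCURRENCES OF {old}")
    return line[:start] + new + line[start + len(old):]

line3 = "(A-B) + (B-C) = (A + (-B)) + (B-C)"

# Occurrence 1 rewrites the left-hand side, which the student then deletes with DLL.
first_try = replace_occurrence(line3, "(B-C)", "(B + (-C))", 1)
# Occurrence 2 rewrites the right-hand side, which is what the student wanted.
line6 = replace_occurrence(line3, "(B-C)", "(B + (-C))", 2)
print(first_try)
print(line6)
```

Asking for a third occurrence raises the error, which corresponds to the kind of mistake the student makes later with RE.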



Insert Table 2 about here 



TABLE 1 

Example 1 

400.1 

DERIVE R 

P         (1)   RVS 

P         (2)   ¬Q -> ¬S 

P         (3)   ¬(¬Q) -> (U -> ¬W) 

P         (4)   ¬(¬U) 

P         (5)   WV¬S 

WP        (6)      ¬R 

1.6DD     (7)      S 

2.7DC     (8)      Q 

8DN       (9)      ¬(¬Q) 

3.9AA     (10)     U -> ¬W 

4DN       (11)     U 

10.11AA   (12)     ¬W 

5.12DD    (13)     ¬S 

6.7.13IP  (14)  R 

CORRECT 
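
As a semantic cross-check on Example 1, one can confirm by truth table that the five premises leave no assignment in which R is false. The sketch below rests on a stated assumption: premise (5) is read as W V ¬S, which is what the explanation of line 13 (deny disjunct yielding not S) requires. The helper names are hypothetical and the check says nothing about how LIS itself verifies proofs.

```python
from itertools import product

# Illustrative sketch only; premise (5) is ASSUMED to read W V ¬S.

def implies(a, b):
    """Material conditional a -> b."""
    return (not a) or b

def entails(premises, conclusion, letters):
    """True if every assignment satisfying all premises also satisfies the conclusion."""
    for values in product([True, False], repeat=len(letters)):
        env = dict(zip(letters, values))
        if all(p(env) for p in premises) and not conclusion(env):
            return False
    return True

premises = [
    lambda e: e["R"] or e["S"],                                          # (1) RVS
    lambda e: implies(not e["Q"], not e["S"]),                           # (2) ¬Q -> ¬S
    lambda e: implies(not (not e["Q"]), implies(e["U"], not e["W"])),    # (3)
    lambda e: not (not e["U"]),                                          # (4) ¬(¬U)
    lambda e: e["W"] or not e["S"],                                      # (5) WV¬S (assumed)
]
valid = entails(premises, lambda e: e["R"], ["Q", "R", "S", "U", "W"])
print(valid)
```

With premise (5) read instead as W V S the argument would not be valid, which is one more reason to prefer the reading with the negation.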





TABLE 2 

Example 2 

417.31: HERE IS THEOREM 22 

DERIVE: (A-B) + (B-C) = A-C 

N              A + (-B) = A-B 
A:A 
B:B       (1)  A + (-B) = A-B 

1CE1      (2)  A-B = A + (-B) 

2AE 
: B-C     (3)  (A-B) + (B-C) = (A + (-B)) + (B-C) 

NU 
NU IS NOT A RULE 

N              A + (-B) = A-B 
A:B 
B:C       (4)  B + (-C) = B-C 

4CE1      (5)  B-C = B + (-C) 

3.5RE1    (6)  (A-B) + (B + (-C)) = (A + (-B)) + (B-C) 

DLL 

3.5RE2    (6)  (A-B) + (B-C) = (A + (-B)) + (B + (-C)) 

6AR2 
YOU MAY NOT ASSOCIATE RIGHT AROUND + SIGN NUMBER 2 

6AR3      (7)  (A-B) + (B-C) = A + ((-B) + (B + (-C))) 

7AL3      (8)  (A-B) + (B-C) = A + (((-B) + B) + (-C)) 

N              A + (-B) = A-B 
A:A 
B:C       (9)  A + (-C) = A-C 

TH2            (-A) + A = 0 
A:B       (10) (-B) + B = 0 

8.10RE2 
THERE ARE NOT 2 OCCURRENCES OF (-B) + B IN LINE 8 

8.10RE1   (11) (A-B) + (B-C) = A + (0 + (-C)) 

TH1            0 + A = A 
A:-C      (12) 0 + (-C) = -C 

11.12RE1  (13) (A-B) + (B-C) = A + (-C) 

13.9RE1   (14) (A-B) + (B-C) = A-C 


CORRECT 
APPENDIX D 



Description of Data Analysis Programs 



In this appendix we describe the programs, written by the author, 
which were used to reduce and analyze the data. 

Logic Program 

Each day during the summer of 1970, a file was created for each 
logic student on the PDP-10 disk file system. Files were identified on 
the disk by a file name (up to six characters) and a file extension (up 
to three characters) written as NNNNNN.EEE. The name chosen for each 
student file was the student's account number; the extension was the 
date. Thus, logic student L1125 on July 13 had his data recorded on a 
file named L1125.713. At the end of each day, the student data files 
were transferred to magnetic tape. The format of these files is given 
in Table 1. 
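
The naming convention can be sketched as a small helper. The only documented case is L1125 on July 13 giving L1125.713; the rule used here (month number followed by a two-digit day) is an assumption that reproduces that case and fits the three-character limit only for single-digit months such as those of the summer session. The function name is hypothetical.

```python
from datetime import date

# Illustrative sketch only; the month+day rule is an ASSUMPTION inferred from "L1125.713".

def student_file_name(account, day):
    """Build NNNNNN.EEE from an account number and a date."""
    if len(account) > 6:
        raise ValueError("file names are limited to six characters")
    extension = f"{day.month}{day.day:02d}"
    if len(extension) > 3:
        raise ValueError("file extensions are limited to three characters")
    return f"{account}.{extension}"

name = student_file_name("L1125", date(1970, 7, 13))
print(name)
```

How the original system encoded single-digit days or months after September is not stated in the text, so those cases are left as guesses in this sketch.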

Insert Table 1 about here 

Data Reduction 

In the fall of 1970, a series of programs were written to convert 
the raw data into a format acceptable to the BMD programs. We give here 
a brief description of these programs, indicating the programming 
language used in each case. 

PASS1 - PDP-10 assembly language 

Input: daily student data files 

(1) combined the data in the individual student files 
described above into one data file per student. 
PASS2 - SAIL* 



Input: output files from PASS1 

(1) created a separate file for each logic problem. 



PASS3 - SAIL 



Input: output files from PASS2 

(1) extracted the following information from each problem file: 

(a) problem number 

(b) number of students who attempted the problem 

(c) number of students for whom there was complete data 
on the problem. As mentioned in Chapter III, some data 
were lost due to system or machine failures, so that there 
were incomplete data for some students on some problems. 

(d) mean and standard deviation of the number of lines 
in a complete derivation for the problem. Here and below 
we define the mean as: 

\[ \bar{X} = \Bigl( \sum_{i=1}^{N} X_i \Bigr) / N \] 

where N is the number of students completing the 
problem, and the standard deviation as: 

\[ \text{Stan. Dev.} = \sqrt{ \sum_{i=1}^{N} (X_i - \bar{X})^2 / (N-1) } \] 

(e) mean and standard deviation of latency to solution. 

(f) mean and standard deviation of latency per line. 

(g) mean and standard deviation of corrected latency per line. 

(h) mean and standard deviation of number of error messages. 

(i) mean and standard deviation of number of DLL's. 

(j) mean and standard deviation of number of restarts. 

(2) created ASCII files of the above information formatted for 
printing on a teletype or displaying on a CRT. These could 
also be used as input for the BMD programs. 

*Stanford Artificial Intelligence Laboratory's Algol-like language. 
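
The two summary statistics defined in item (d) are easy to state in code. The sketch below is not the original SAIL program; it simply applies the mean and the N-1 standard deviation to a hypothetical list of line counts, and the function names and sample values are assumptions for illustration.

```python
import math

# Illustrative sketch only; sample data are hypothetical.

def mean(xs):
    """Arithmetic mean over the N students who completed the problem."""
    return sum(xs) / len(xs)

def std_dev(xs):
    """Standard deviation with N-1 in the denominator, as in item (d)."""
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

lines_per_student = [4, 6, 8]   # hypothetical line counts for one problem
print(mean(lines_per_student), std_dev(lines_per_student))
```

The same two functions would apply unchanged to the latency, error-message, DLL, and restart measures listed in items (e) through (j).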

COMB - Fortran 

Input: output from PASS3 and a file containing the values of the 
structural variables which were typed as input by hand on 
the CRT's. 

(1) combined the two input files into one file containing both 
the behavioral and structural variables. 

SORT - Fortran 

Input: output from PASS3 

(1) produced a rank-ordering of the problems for each of the 
five behavioral measures. 
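
A rank-ordering of this kind amounts to sorting the problems once per measure. The following sketch is a modern stand-in for the Fortran program, not a reconstruction of it: the record layout, the measure names, the direction of the ordering (largest value first), and the sample values are all hypothetical.

```python
# Illustrative sketch only; records, measure names, and values are hypothetical.

def rank_order(problems, measure):
    """Return problem numbers ordered from the largest to the smallest value of `measure`."""
    ordered = sorted(problems, key=lambda p: p[measure], reverse=True)
    return [p["problem"] for p in ordered]

problems = [
    {"problem": "400.1", "mean_latency": 95.0, "mean_errors": 0.2},
    {"problem": "413.33", "mean_latency": 240.0, "mean_errors": 1.1},
    {"problem": "417.31", "mean_latency": 410.0, "mean_errors": 0.9},
]
by_latency = rank_order(problems, "mean_latency")
by_errors = rank_order(problems, "mean_errors")
print(by_latency, by_errors)
```

Different measures can order the same problems differently, which is why a separate ranking per measure is produced.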

Analysis 

In addition to writing the above programs, I also implemented the 
BMD06M program on the PDP-10 and modified the already existing BMD02R 
program to produce the plots mentioned in Chapter III. 




TABLE 1 



Format of Raw Logic Data 



The first four words of each student file were: 

word1: Student account number 
word2: Date 
word3: Start time 
word4: New day code - 761616161616 

Whenever the student was restarted, the above four words were put in 
his file. 

The first words for every problem were: 

word1: New problem code - 716161616161 
word2: Problem start time 
word3: Problem and lesson number 
words 4 to n: Problem type codes 

These were followed by response codes. For each student input these were: 

word1: Response code - 767676767676 
words 2 to n-1: Student response in ASCII 
word n: Latency to response 

Each time a student timed out, the following information was recorded: 

word1: TIMOUT 
word2: Time of the time-out 

Each time a student asked for a hint and the hint clock had not fired, 
the student received one of the following two messages. For "A HINT IS 
NOT AVAILABLE NOW" we recorded: 

word1: NOTNOW 
word2: Time of message 

When the student received "THINK A LITTLE LONGER", we recorded: 

word1: KEEPON 
word2: Time of message 

When the student received an error message, we recorded: 

word1: ERRORS 
word2: Error message number 
word3: Contents of an accumulator containing information about the error 
word4: Time of the error 

At the end of each problem, we recorded: 

word1: Problem end code - 766766766766 
word2: Time of end of problem 

Finally, at the time that each student was signed off, we recorded: 

word1: Sign-off code - 776776776776 
word2: Time of sign-off 
word3: 747474747474 
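
To make the record layout concrete, the sketch below scans a stream of 36-bit words for the marker codes and counts student responses per problem. It is a simplified illustration under stated assumptions: the codes are read as 12-digit octal numbers, the constant and function names are invented for the example, the sample stream is hypothetical, and the scan ignores the possibility that an ordinary data word happens to equal a marker, which a real reader of these files would have to rule out by following the record structure.

```python
# Illustrative sketch only; names and sample data are hypothetical.
NEW_DAY = 0o761616161616
NEW_PROBLEM = 0o716161616161
RESPONSE = 0o767676767676
PROBLEM_END = 0o766766766766
SIGN_OFF = 0o776776776776

def responses_per_problem(words):
    """Count response markers between each new-problem marker and its problem-end marker."""
    counts = []
    current = None
    for w in words:
        if w == NEW_PROBLEM:
            current = 0
        elif w == RESPONSE and current is not None:
            current += 1
        elif w == PROBLEM_END and current is not None:
            counts.append(current)
            current = None
    return counts

# A hypothetical one-problem session: header, one problem with two responses, sign-off.
words = [
    1125, 713, 900, NEW_DAY,
    NEW_PROBLEM, 905, 50217,
    RESPONSE, 0o101, 12,
    RESPONSE, 0o102, 7,
    PROBLEM_END, 930,
    SIGN_OFF, 940, 0o747474747474,
]
counts = responses_per_problem(words)
print(counts)
```

Each marker fits in one 36-bit PDP-10 word, since twelve octal digits carry exactly 36 bits.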